(57) Abstract

The present invention relates to the HIV-1 strains MN-ST1 and BA-L, which are
typical United States HIV-1 isotypes. The present invention relates to DNA
segments encoding the envelope protein of MN-ST1 or BA-L, to DNA constructs
containing such DNA segments and to host cells transformed with such
constructs. The viral isolates and envelope proteins of the present invention
are of value for use in vaccines and bioassays for the detection of HIV-1
infection in biological samples, such as blood bank samples.
MOLECULAR CLONES OF HIV-1 AND USES THEREOF

BACKGROUND OF THE INVENTION 
HIV-1 has been identified as the etiologic agent of the acquired
immunodeficiency syndrome (AIDS) (Barre-Sinoussi et al., Science 220, 868-871,
1983; Popovic et al., Science 224, 497-500, 1984; Gallo et al., Science 224,
500-503, 1984). Infected individuals generally develop antibodies to the virus
within several months of exposure (Sarngadharan et al., Science 224, 506-508,
1984), which has made possible the development of immunologically based tests
which can identify most blood samples from infected individuals. This is a
great advantage in diagnosis, and is vital to maintaining the maximum possible
safety of samples from blood banks.
An important aspect of HIV-1 is its genetic variability (Hahn et al., Proc.
Natl. Acad. Sci. U.S.A. 82, 4813-4817, 1985). This is particularly evident in
the gene for the outer envelope glycoprotein (Starcich et al., Cell 45,
637-648, 1986; Alizon et al., Cell 46, 63-74, 1986; Gurgo et al., Virology
164, 531-536, 1988). Since the outer envelope glycoprotein is on the surface
of the virus particle and the infected cell, it is potentially one of the
primary targets of the immune system, including the target of neutralizing
antibodies and cytotoxic T cells. This variability may also lead to
differences in the ability of antigens from different strains of HIV-1 to be
recognized by antibodies from a given individual, as well as to differences in
the ability of proteins from different strains of virus to elicit an immune
response which would be protective against the mixture of virus strains that
exists in the at-risk populations.

Several biologically active complete molecular clones of various strains of
HIV-1 have been obtained and sequenced. These clones, however, seem to
represent viral genotypes which are relatively atypical of United States HIV-1
isolates. In addition, several of the translational reading frames for
non-structural viral proteins are not complete. Further, viruses derived from
these clones do not grow in macrophages, in contrast to many HIV-1 field
isolates and, perhaps because of this lack of ability to infect macrophages
efficiently, these clones do not replicate well in chimpanzees. This latter
ability is important for testing candidate vaccines in animal systems. In
addition, the ability to infect macrophages is critical in evaluating the
possible protective efficacy of an elicited immune response, since
neutralization of infectivity on macrophages may differ from the better
studied neutralization on T cells.

Neutralizing antibodies (Robert-Guroff et al., Nature 316, 72-74, 1985; Weiss
et al., Nature 316, 69-72, 1985) have been demonstrated in infected
individuals, as have cytotoxic T cell responses (Walker et al., Nature 328,
345-348, 1988). Although these do not appear to be protective, it is likely
that if they were present prior to infection, they would prevent infection,
especially by related strains of virus. This is supported by the finding that
macaques can be protected by immunization with inactivated simian
immunodeficiency virus (SIV) from infection with the homologous live virus
(Murphy-Corb et al., Science 246, 1293-1297, 1989). Chimpanzees also have been
passively protected against challenge by live virus by prior administration of
neutralizing antibodies to the same virus (Emini et al., J. Virol. 64,
3674-3678, 1989). One problem, however, is that at least some of the
neutralizing antibodies studied depend on recognition of a variable region on
the envelope (Matsushita et al., J. Virol. 62, 2107-2114, 1988; Rusche et al.,
Proc. Natl. Acad. Sci. U.S.A. 85, 3198-3202, 1988; Skinner et al., AIDS Res.
Hum. Retroviruses 4, 187-197, 1988) called the V3 region (Starcich et al.,
Cell 45, 637-648, 1986).

An at least partial solution to the problem of viral heterogeneity is to
identify prototypical HIV-1 strains, that is, those that are most similar by
DNA sequence data or serologic reactivity to strains present in the population
at risk. The inclusion of a limited number of such prototype strains in a
polyvalent vaccine cocktail might then result in elicitation of an immune
response protective against most naturally occurring viruses within a given
population. Such a mixture should also provide the maximum possible
sensitivity in diagnostic tests for antibodies in infected individuals.
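
The idea of a prototype strain that is "most similar by DNA sequence data" to the circulating strains can be illustrated with a minimal Python sketch. It assumes the sequences are already aligned to equal length and scores each isolate by its mean percent identity to the others; the function names and the toy sequences are hypothetical and are not taken from the disclosure, which does not specify a similarity measure.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of matching positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def most_representative(isolates: dict) -> str:
    """Return the name of the isolate with the highest mean identity to all others."""
    def mean_identity(name: str) -> float:
        others = [seq for other, seq in isolates.items() if other != name]
        return sum(percent_identity(isolates[name], seq) for seq in others) / len(others)

    return max(isolates, key=mean_identity)


# Hypothetical toy fragments for illustration only (not real HIV-1 data).
toy_isolates = {
    "isolate_A": "ATGAGAGTGAAG",
    "isolate_B": "ATGAGAGTGAAA",
    "isolate_C": "ATGCGAGTGAAA",
    "isolate_D": "TTGAGTGTGAAA",
}
prototype = most_representative(toy_isolates)
print(prototype)
```

Mean pairwise identity is only one possible criterion; serologic reactivity, which the text also mentions, is not modeled here.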

Components of highly representative isolates of a geographical area provide
the maximum possible sensitivity in diagnostic tests and vaccines. Production
of viral proteins from molecular clones by recombinant DNA techniques is the
preferred and safest means to provide such proteins. Molecular clones of
prototype HIV-1 strains can serve as the material from which such recombinant
proteins can be made. The use of recombinant DNA avoids any possibility of the
presence of live virus and affords the opportunity of genetically modifying
viral gene products. The use of biologically active clones ensures that the
gene products are functional and hence maximizes their potential relevance.

Infectious clones, that is, those which after transfection into recipient
cells produce complete virus, are desirable for several reasons. One reason is
that the gene products are by definition functional; this maximizes their
potential relevance to what is occurring in vivo. A second reason is that
genetically altered complete virus is easy to obtain. Consequently, the
biological consequences of variability can be easily assessed. For example,
the effect of changes in the envelope gene on the ability of the virus to be
neutralized by antibody can be easily addressed. Using this technique, a
single point mutation in the envelope gene has been shown to confer resistance
to neutralizing antibody (Reitz et al., Cell 54, 57-63, 1988). A third reason
is that a clonal virus population provides the greatest possible definition
for challenge virus in animals receiving candidate vaccines, especially those
including components of the same molecularly cloned virus.
SUMMARY OF THE INVENTION
It is an object of the present invention to provide vaccine components for an
anti-HIV-1 vaccine which would represent a typical United States HIV-1
isolate.

It is another object of the present invention to provide diagnostic tests for
the detection of HIV-1.

Various other objects and advantages of the present invention will become
apparent from the drawings and the following description of the invention.
BRIEF DESCRIPTION OF THE DRAWINGS

FIGURE 1 shows the structure and restriction map of the lambda MN-PH1 clone.

FIGURE 2 shows the restriction map of the MN-PH1 envelope plasmid clone.

FIGURE 3 shows the restriction map and structure of the lambda MN-ST1 clone.

FIGURE 4 shows the structure of the lambda BA-L clone.

FIGURE 5 shows the restriction map of the clone BA-L1.

Detailed Disclosure of the Invention 
The present invention relates to the HIV-1 virus strains MN-ST1 and BA-L,
which are more typical of the HIV-1 isolates found in the United States than
previously known HIV-1 strains. Local isolates provide better material for
vaccines and for the detection of the virus in biological samples, such as
blood bank samples.

The present invention relates to DNA segments encoding the env protein of
MN-ST1 or BA-L (the DNA sequences given in Figures 5 and 8 being two such
examples) and to nucleotide sequences complementary to the segments referenced
above, as well as to other genes and nucleotide sequences contained in these
clones. The present invention also relates to DNA segments encoding a unique
portion of the MN-ST1 env protein or the BA-L env protein. (A "unique portion"
consists of at least five (or six) amino acids or, correspondingly, at least
15 (or 18) nucleotides.)
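
The correspondence between five (or six) amino acids and 15 (or 18) nucleotides follows from three nucleotides per codon. The Python sketch below shows that arithmetic and one possible way to list candidate unique portions: windows of a target protein that do not occur in a comparison protein. Reading "unique" as "absent from the comparison sequences" is an assumption, and the comparison peptide is a hypothetical variant made up for illustration.

```python
def nucleotides_for(amino_acids: int) -> int:
    """Each amino acid is encoded by one codon of three nucleotides."""
    return 3 * amino_acids


def unique_windows(target: str, others: list, k: int = 5) -> list:
    """Return the k-residue windows of `target` that occur in none of `others`."""
    windows = [target[i:i + k] for i in range(len(target) - k + 1)]
    return [w for w in windows if not any(w in other for other in others)]


# First ten residues of the env translation in the listing (one-letter code)
# and a hypothetical comparison peptide differing at one position.
target_peptide = "MRVKGIRRNY"
comparison_peptides = ["MRVKGTRRNY"]

print(nucleotides_for(5), nucleotides_for(6))
print(unique_windows(target_peptide, comparison_peptides))
```

A single substitution makes every five-residue window that spans it a candidate, which is why short windows are enough to distinguish closely related strains.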
The invention further relates to the HIV-1 virus strains MN-ST1 and BA-L
themselves. The HIV-1 virus strains of the present invention are biologically
active and can easily be isolated by one skilled in the art using known
methodologies.

The above-described DNA segments of the present invention can be placed in DNA
constructs which are then used in the transformation of host cells for the
generation of recombinantly produced viral proteins. DNA constructs of the
present invention comprise a DNA segment encoding the env protein and the
flanking region of MN-ST1 (or BA-L), or a portion thereof, and a vector. The
constructs can further comprise a second DNA segment encoding both a rev
protein and a rev-responsive region of the env gene operably linked to the
first DNA segment encoding the env protein. The rev protein facilitates
efficient expression of the env protein in eucaryotic cells. Suitable vectors
for use in the present invention include, but are not limited to, pSP72,
lambda EMBL3 and SP65gpt.

Host cells to which the present invention relates are stably transformed with
the above-described DNA constructs. The cells are transformed under conditions
such that the viral protein encoded in the transforming construct is
expressed. The host cell can be procaryotic (such as bacterial), lower
eucaryotic (such as fungal, including yeast) or higher eucaryotic (such as
mammalian). The host cells can be used to generate recombinantly produced
MN-ST1 (or BA-L) env protein by culturing the cells in a manner allowing
expression of the viral protein encoded in the construct. The recombinantly
produced protein is easily isolated from the host cells using standard protein
isolation protocols.

Since HIV-1 strains MN-ST1 and BA-L represent relatively typical United States
genotypes, non-infectious MN-ST1 or BA-L proteins (for example, the env
protein), peptides or unique portions of MN-ST1 or BA-L proteins (for example,
a unique portion of the env protein), and even whole inactivated MN-ST1 or
BA-L can be used as an immunogen in mammals, such as primates, to generate
antibodies capable of neutralization and T cells capable of killing infected
cells. The protein can be isolated from the virus or made recombinantly from a
cloned envelope gene. Accordingly, the virus and viral proteins of the present
invention are of value as either a vaccine or a component thereof, or an agent
in immunotherapeutic treatment of individuals already infected with HIV-1.

As is customary for vaccines, a non-infectious antigenic portion of MN-ST1 or
BA-L, for example, the env protein, can be delivered to a mammal in a
pharmacologically acceptable carrier. The present invention relates to
vaccines comprising non-infectious antigenic portions of either MN-ST1 or BA-L
and vaccines comprising non-infectious antigenic portions of both MN-ST1 and
BA-L. Vaccines of the present invention can include effective amounts of
immunological adjuvants known to enhance an immune response. The viral protein
or polypeptide is present in the vaccine in an amount sufficient to induce an
immune response against the antigenic protein and thus to protect against
HIV-1 infection. Protective antibodies are usually best elicited by a series
of 2-3 doses given about 2 to 3 weeks apart. The series can be repeated when
circulating antibody concentration in the patient drops.
Virus derived from the infectious HIV-1(MN) clone, MN-ST1, may also be used
for reproducible challenge experiments in chimpanzees treated with candidate
HIV-1 vaccines or in vitro with human antiserum from individuals treated with
candidate vaccines. A candidate vaccine can be administered to a test mammal,
such as a chimpanzee, prior to or simultaneously with the infectious MN-ST1
virus of the present invention. Effectiveness of the vaccine can be determined
by detecting the presence or absence of HIV-1 infection in the test mammals.
Side-by-side comparative tests can be run by further administering to a second
set of test mammals the virus alone and comparing the number of infections
which develop in the two sets of test mammals. Alternatively, candidate
vaccines can be evaluated in humans by administering the vaccine to a patient
and then testing the ability of the MN-ST1 virus to infect blood cells from
the patient.
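
The side-by-side comparison of infections in a vaccinated set and a virus-only set can be summarized numerically. The Python sketch below uses the conventional definition of protective efficacy as one minus the ratio of attack rates; this formula is an assumption brought in for illustration, since the text only says that the numbers of infections are compared, and the counts are hypothetical.

```python
def attack_rate(infected: int, total: int) -> float:
    """Fraction of animals in a group that became infected."""
    if total <= 0 or not 0 <= infected <= total:
        raise ValueError("need 0 <= infected <= total and total > 0")
    return infected / total


def protective_efficacy(vaccinated_infected: int, vaccinated_total: int,
                        control_infected: int, control_total: int) -> float:
    """1 - (attack rate in vaccinated group / attack rate in virus-only group)."""
    control_rate = attack_rate(control_infected, control_total)
    if control_rate == 0:
        raise ValueError("no infections in the control group; efficacy is undefined")
    return 1.0 - attack_rate(vaccinated_infected, vaccinated_total) / control_rate


# Hypothetical counts: 1 of 4 vaccinated animals infected, 4 of 4 controls infected.
print(protective_efficacy(1, 4, 4, 4))
```

With groups as small as those typical of primate studies, such a point estimate carries wide uncertainty and would normally be reported alongside the raw counts.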

The present invention also relates to the detection of HIV-1 virus in a
biological sample. For detection of an HIV-1 infection, the presence of the
virus, proteins encoded in the viral genome, or antibodies to HIV-1 is
determined. Many types of tests, as one skilled in the art will recognize, can
be used for detection. Such tests include, but are not limited to, ELISA and
RIA.

In one bioassay of the present invention all, or a unique portion, of the env
protein is coated on a surface and contacted with the biological sample. The
presence of a resulting complex formed between the protein and antibodies
specific therefor in the serum can be detected by any of the known methods
commonly used in the art, such as, for example, fluorescent antibody
spectroscopy or colorimetry.

The following non-limiting examples are given to further demonstrate the
present invention without being deemed limitative thereof.

EXAMPLES 

MN-PH1 Clone

The permuted circular unintegrated viral DNA representing the complete
HIV-1(MN) genome was cloned by standard techniques (Sambrook et al., 1989,
Molecular Cloning. Cold Spring Harbor, New York: Cold Spring Harbor Laboratory
Press) into the EcoRI site of lambda gtWES.lambda B DNA from total DNA of H9
cells producing HIV-1(MN). This clone is designated lambda MN-PH1, and its
structure and restriction map are shown in Figure 1. The clone was subcloned
into M13mp18 and M13mp19, and the DNA sequence of the entire clone, given in
Figure 2, was obtained by the dideoxy chain termination method (Sanger et al.,
Proc. Natl. Acad. Sci. U.S.A. 74, 5463-5467, 1977). The amino acid sequence of
the envelope protein (see Table I) was inferred from the DNA sequence. A
restriction map of the cloned unintegrated viral DNA (see Figure 1) was also
obtained from the DNA sequence of lambda PH1 and used in conjunction with the
inferred amino acid sequence of the viral proteins to subclone the envelope
(env) gene into the commercially available plasmid pSP72 (Promega Biological
Research Products, Madison, WI), as shown in Figure 2. This plasmid
(pMN-PH1env) contains, in addition to the coding regions for the envelope
proteins, the coding region for the rev protein (Feinberg et al., Cell 46,
807-817, 1986) and the portion of the env gene which contains the
rev-responsive region (Dayton et al., J. Acquir. Immune. Defic. Syndr. 1,
441-452, 1988), since both are necessary for efficient expression of the
envelope protein in eucaryotic cells. This plasmid thus contains all the
elements required for production of envelope protein following placement into
appropriate expression vectors and introduction into recipient cells, all by
standard techniques known to molecular biologists.
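
Deriving a restriction map from a DNA sequence, as described above, amounts to locating each enzyme's recognition sequence. The Python sketch below is a minimal illustration for the four enzymes named in this document, using their standard recognition sites; the function name and the toy sequence are hypothetical, the sequence is treated as linear, and the reported positions are 1-based starts of the recognition site rather than exact cut positions.

```python
import re

# Standard recognition sequences; all four are palindromic, so scanning
# one strand finds every site.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "XbaI": "TCTAGA",
}


def restriction_map(sequence: str, sites: dict = RECOGNITION_SITES) -> dict:
    """Map each enzyme name to the 1-based start positions of its sites."""
    sequence = sequence.upper()
    return {
        enzyme: [m.start() + 1 for m in re.finditer(f"(?={motif})", sequence)]
        for enzyme, motif in sites.items()
    }


# Hypothetical toy sequence for illustration only.
toy_sequence = "CCGAATTCAAGGATCCTTAAGCTTGG"
print(restriction_map(toy_sequence))
```

The lookahead pattern reports overlapping matches as well, which matters little for six-base palindromes but keeps the scan general.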
MN-ST1 Clone

The infectious molecular clone, lambda MN-ST1, was obtained by cloning
integrated provirus from DNA purified from peripheral blood lymphocytes
infected with HIV-1(MN) and maintained in culture for a short time (one
month). The integrated proviral DNA was partially digested with the
restriction enzyme Sau3A under conditions which gave a maximum yield of DNA
fragments of 15-20 kilobases (Kb). This was cloned into the compatible BamHI
site of lambda EMBL3, as shown in Figure 3. Figure 3 also shows the
restriction map of clone lambda MN-ST1. The DNA sequence of the entire clone,
given in Table II, was obtained by the dideoxy chain termination method
(Sanger et al., Proc. Natl. Acad. Sci. U.S.A. 74, 5463-5467, 1977). The amino
acid sequence was predicted from the DNA sequence (see Table II). This clone
can be transfected into recipient cells by standard techniques. After
transfection, the cloned proviral DNA is expressed into biologically active
virus particles, which can be used as a source for virus stocks. The proviral
DNA, whose restriction map is shown in Figure 2, was removed from the lambda
phage vector by digestion with BamHI and inserted into a plasmid, SP65gpt
(Feinberg et al., Cell 46, 807-817, 1986). This plasmid, pMN-ST1, contains an
SV40 origin of replication. Consequently, transfection into COS-1 cells
(Gluzman, Y., Cell 23, 175-182, 1981), which produce an SV40 gene product
which interacts with the cognate origin of replication, results in a transient
high plasmid copy number with a concomitant production of large amounts of
replication-competent, infectious virus (Feinberg et al., Cell 46, 807-817,
1986). This provides a convenient source of genetically homogeneous virus, as
well as a way to introduce desired mutations using standard methods.
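
Two points in the cloning step above lend themselves to a short Python sketch: why Sau3A fragments fit a BamHI site, and which fragments of a partial digest fall in the 15-20 Kb window. Sau3A cuts before GATC and BamHI cuts after the first G of GGATCC, so both leave the same four-base overhang. The helper names and the cut positions below are hypothetical; real partial digests are tuned empirically rather than enumerated.

```python
# (recognition site, offset of the top-strand cut from the start of the site)
ENZYMES = {
    "Sau3A": ("GATC", 0),
    "BamHI": ("GGATCC", 1),
}


def overhang(site: str, cut_offset: int) -> str:
    """5' overhang left by a palindromic site that is cut symmetrically."""
    return site[cut_offset:len(site) - cut_offset]


def compatible(enzyme_a: str, enzyme_b: str) -> bool:
    """Two enzymes give ligatable ends if their overhangs are identical."""
    return overhang(*ENZYMES[enzyme_a]) == overhang(*ENZYMES[enzyme_b])


def partial_digest_fragments(cut_positions, length, lo, hi):
    """Fragments (start, end) between any two cut points or ends, sized lo..hi."""
    ends = [0] + sorted(cut_positions) + [length]
    return [(a, b) for i, a in enumerate(ends) for b in ends[i + 1:]
            if lo <= b - a <= hi]


# Hypothetical cut positions on a 40 Kb stretch of genomic DNA.
fragments = partial_digest_fragments([5000, 12000, 21000, 30000], 40000, 15000, 20000)
print(compatible("Sau3A", "BamHI"), fragments)
```

Because a partial digest leaves some sites uncut, fragments spanning several sites are possible, which is what allows an intact provirus to be recovered on a single insert.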

The envelope gene was excised from the lambda phage clone and cloned into a
plasmid as described above for lambda MN-PH1. This clone (pMN-ST1env) is
similar to pMN-PH1env, described above, except that it derives from a
biologically active cloned provirus. Like pMN-PH1env, it can be placed in a
suitable vector and host to produce the envelope protein of HIV-1(MN) by well
known techniques.
BA-L Clone 

A HindIII fragment of unintegrated viral DNA representing the HIV-1(BA-L)
genome was cloned by standard techniques into lambda phage Charon 28 DNA from
total DNA of peripheral blood macrophages infected with and producing
HIV-1(BA-L). A positive clone was selected by hybridization using a
radiolabelled probe for the HIV-1 envelope. This clone, designated lambda
BA-L1, was found to contain the entire gene for the envelope protein. Its
structure is given in Figure 4. The insert was transferred into a plasmid
(pBluescript, Stratagene, La Jolla, CA) and the DNA sequence of the env gene
was determined (see Table III). This clone is designated pBA-L1.

The amino acid sequence of the envelope protein, shown in Table III, was
inferred from the DNA sequence. A restriction map was also obtained from the
DNA sequence of BA-L1 (shown in Figure 5) in order to determine the
appropriate restriction enzyme sites for cloning the env gene into suitable
expression vectors. An EcoRI-HindIII fragment of 0.4 Kb and a 2.8 Kb
HindIII-XbaI fragment, when cloned together, constitute the entire env gene.
This plasmid contains, in addition to the coding regions for the envelope
proteins, the coding region for the rev protein and the portion of the env
gene which contains the rev-responsive region. Both are necessary for
efficient expression of the envelope protein in eucaryotic cells (Feinberg et
al., Cell 46, 807-817, 1986; Dayton et al., J. Acquir. Immune. Defic. Syndr.
1, 441-452, 1988). This plasmid thus contains all the HIV-1 genetic elements
required for production of envelope protein following placement into
appropriate expression vectors and introduction into recipient cells, all by
standard techniques well known in the art.
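
Inferring an amino acid sequence from a DNA sequence, as done for each clone above, is a codon-by-codon lookup in the standard genetic code. The Python sketch below builds that table and translates the first ten codons of the env reading frame from the sequence listing, reading the digit printed in the fifth codon as G (an assumption about the scanned text). The function name is hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence in frame, stopping at the first stop codon."""
    dna = dna.upper().replace(" ", "")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# First ten codons of the env open reading frame as read from the listing.
env_start = "ATG AGA GTG AAG GGG ATC AGG AGG AAT TAT"
print(translate(env_start))
```

The result is in one-letter code (M R V K G I R R N Y), corresponding to Met Arg Val Lys Gly Ile Arg Arg Asn Tyr in the three-letter notation used in the listing.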
Statement of Deposit 

The lambda MN-ST1 clone and the BA-L plasmid clone were deposited at the
American Type Culture Collection (ATCC), 12301 Parklawn Drive, Rockville,
Maryland 20852, U.S.A., on September 13, 1990, under the terms of the Budapest
Treaty. The lambda MN-ST1 clone has been assigned the ATCC accession number
ATCC 40889 and the BA-L plasmid clone has been assigned the ATCC accession
number ATCC 40890.

* * * * * * 

All publications mentioned hereinabove are hereby 
incorporated by reference. 

While the foregoing invention has been described in some detail for purposes
of clarity and understanding, it will be appreciated by one skilled in the art
from a reading of this disclosure that various changes in form and detail can
be made without departing from the true scope of the invention.
TGGMGG6CT AATTGACTCC CAACGAA6AC AA6ATATCCT T6ATCT6TG6 ATCTACCACA 60
CACAAGGCTA CTTCCCT6AT TAGCA6AACT ACAGACCAGG 6CCAGG6ATC AGATATOCAC 120 
T6ACCTTTGG ATGGTGCTAC AAGCTAGTAC CAGTT6AGCC A6A6AAGTTA 6AAGAAGCCA 180 
ACAAAGGAGA GAACA0GA6C TTGTTACACC CTGTGAGCCT GCAT6GAATG GATGACCOGG 240 
AGA6AGAAGT GTTAGAGTGG AG6TTTGACA GCCGCCTAGC ATTTCATCAC ATGGCCCGAG 300 
AGCTGGATCC GGAGTACTTC AAGAACTGCT GAGATOGAGC TTGCTACAAG GGACTTTCCG 360 
CTGGGGACTT TCCAGGGAGG CGT6GCCTGG GCGGGACIGG GGAGTGGCGA GCCCTGAGAT 420 
OCTGCATATA AGCAGCTGCX TTTTGCCTGT ACTGGGTCTC TCTGGTTA6A CCAGATCTGA 480 
GCCTGGGAGC TCTCTGGCTA ACTAGGGAAC CGACTGCTTA AGCCTCAATA AAGCTT6CCT 540 
TGAGTGCTTC AAGTAGTGTG TGCCOGTCTG TTATGTGACT CTGGTAGCTA GAGATCCCTC 600 
AGATCCTTTT AGGCAGTGTG GAAAATCTCT AGCAGT6G0G CCCGAACAG6 GACTTGAAAG 660 
C6AAAGAAAA ACGAGAGCTC TCTOGAOGCA GGACTCGGCT TGCTGAAGCG CGCACGGCAA 720 
GAGGCGAGGG 6CGGCGACTG GTGAGTACGC CAAAAATTCT TGACTAGCGG AGGCTA6AAG 780 
GAGAGAGATG GGTGCGAGAG CGTCGGTATT AAG06GGGGA GAATTAGATC GATGGGAAAA 840 
CATTGGGTTA AGGCGAGGGG GAAAGAAAAA ATATAAATTA AAACATGTAG TATGGGCAAG 900 
CAGGGAGCTA GAA0GATT06 CAGTCAATCC TGGCCTGTTA GAAACATCAG AAG6CT6TAG 960 
AGAAATAC7G GGACAGCTAC AACCATCCCT TCAGACAGGA TCAGAAGAAC TTAAATCATT 1020 
ATATAATACA GTAGCAACCC TCTATTGTGT 6CATCAAAAG ATAGAGATAA AAGACACCAA 1080 
GGAAGCTTTA GAGAAAATAG AGGAAGAGCA AAACAAAAGT AAGAAAAAAG CAGAGCAA6C 1140 
AGCAGCTGAC ACAGGAAACA 6AGGAAACAG CAGCCAAGTC AGCCAAAATT ACCCCATA6T 1200 
GCAGAACATC GA6GGGCAAA TGGTACATCA GGCCATATCA CCTAGAACTT TAAATGCATG 1260 
GGTAAAAGTA GTAGAAGAGA AGGCTTTCAG CCCAGAAGTA ATACCCATGT TTTCA6CATT 1320 
ATGA6AAGGA GCGACCCCAC AAGATTTAAA CACCATGCTA AACACAGTGG 6GGGACATCA 1380 
AGCAGCGATG GAAATGTTAA AAGAGACCAT CAATGAGGAA GCTGGAGAAT GGGATAGATT 1440 
GCATCCAGTG CATGCAGGGC CTATTACACC AGGCCAGATG AGAGAACCAA GGGGAAGTGA 1500 
CATAGGAGGA ACTACTAGTA CCCTTCAGGA ACAAATAGGA TGGATGACAA ATAATCCACC 1560 
TATCCCAGTA GGAGAAATCT ATAAAAGATG GATAATCCTG GGATTAAATA AAATAGTAAG 1620 
GATGTATAGC CCTTCCAGCA TTCTGGACAT AAGACAAGGA CCAAA6GAAC CCTTTAGAGA 1680 
CTAT6TAGAC CGGTTCTATA AAACTCTAAG AGCC6AGCAA GCTTCACA6G AGGTAAAAAA 1740 
COGGAOGAGA GAAACCTTGT TGGTCCAAAA TGCGAACCCA GATTGTAAGA CTATTTTAAA 1800 
AGCATTGGGA CCAGCAGCTA CACTAGAAGA AATCATGACA GCATGTCAGG GAGT6GGAG6 1860 
ACCTGGTCAT AAAGCAAGA6 TTTTGGCGGA AGCGATGAGC CAAGTAACAA ATTCA6CTAC 1920 
[illegible] CAGAGCCAAC 2160
A6CCCCACC& GAAGAGAGCT TCAGGTTTG6 GGAI^AGACA ACAACTCCCT ATCA6AAGCA 2220
GGAfiAAGAAG CA6GAGAOGA TAGACAAGGA CCTGTATCCT TTAGCTTCCC TCAAATCACT 2280
CTTTGGCAAC GACCCATTGT CACAATAAAG ATA6GGGGGC AACTAAAGGA AGCTCTATTA 2340
GATACA6GAG CAGATGATAC AGTATTAGGA GAAATGAATT TGCCAAGAAG ATGGAAACCA 2400
AAAATGATA6 GGGGAATTGG AGGT7TTATC AAAGTAAGAC AGTATGATCA GATAACCATA 2460
GGAATCTGTG GACATAAAGC TATAGGTACA GTATTAGTAG GACCTACACC TGTCAACATA 2520
ATTCGAAGAA ATCTGTTGAC TCAGCTTGGG TGCACTTTAA ATTTTCCCAT TAGTCCTATT 2580
GAAACT6TAC CAGTAAAATT AAAGCCAGGA ATGGATGGCC CAAAAGTTAA ACAATGGCCA 2640
TTGACAGAAG AAAAAATAAA AGCATTAATA GAAATTTGTA CAGAAATGGA AAAGGAAGGG 2700
AAAATTTCAA AAATTGGGCC TGAAAATCCA TACAATACTC CAGTATTT6C CATAAAGAAA 2760
AAAGACAGTA CTAAATGGAG AAAATTAGTA GATTTCAGAG AACTTAATAA GAAAACTCAA 2820
GACTTCTCGG AAGTTCAATT AGGAATACCA CATCCTGCAG GGTTAAAAAA GAAAAAATCA 2880
GTAACAGTAC TGGATGTGGG TGATGCATAT TTTTCAGTTC CCTTAGATAA AGACTTCAGG 2940
AAGTATACTG CAT^FTACCAT ACCTAGTATA AACAATGAAA CACGAGGGAT TAGATATCAG 3000
TACAAT6TGC TTCCACAGG6 ATGGAAAGGA TCACCAGCAA TATTCCAAAG TA6CAT6ACA 3060
AAAATCTTAG AGCCTTfFTAG AAAACAAAAT CCAGACATAG TTATCTATCA ATACATGGAT 3120
GATTTGTATG TAGGATCTGA CTTAGAAATA GGGCAGCATA GAGCAAAAAT AGAGGAACTG 3180
AGAGGACATC TGTTGAGGTG 66GATTTACC ACACCAGACA AAAAACATCA GAAAGAACCT 3240
CCATTCCTTT GGATGGGTTA TGAACTCCAT CCTGATAAAT GGACAGTACA GCCTATAGTG 3300
CTACCAGAAA AAGACAGCTG GACTGTCAAT GACATACAGA AGTTAGTGGG AAAATTGAAT 3360
TGGGCAAGTC AGATTTACGC AGGGATTAAA GTAAAGCAAT TATGTAAACT CCTTAGAGGA 3420
ACCAAAGCAC TAACAGAAGT AATACCACTA ACAGAAGAAG CAGAGCTAGA ACTGGCAGAA 3480
AACAGGGAAA TTCTAAAAGA ACCAGTACAT GGAGTGTATT ATGACCCATC AAAAGACTTA 3540
ATAGCAGAAG TACAGAAGCA GGGGCAAGGC CAATGGACAT ATCAAATTTA TCAAGAGCCA 3600
TTT2UVAAATC TGAAAACAGG CAAATATGCA AGAATGAGGG GTGCCCACAC TAATGATGTA 3660
AAACAATTAA CAGAGGGAGT GCftAAAAATA 6CCACAGAAA GCATAGTAAT ATGGGGAAAG 3720
ACTCCTAAAT TTAGACTACC CATACAAAAA 6AAACATCGG AAACATGGTG GAGA6AGTAT 3780
A0GTAA60CA CCTGGATTCC TGAGTGGGAG GTTGTCAATA CCCCTCCCTT AGT6AAATTA 3840
TGGTACGAGT TA6A6AAAGA ACCCATAGTA GGTGCAGAAA CTTTCTATGT AGATGGGGCA 3900
GCTAACAGG6 A6ACTAAAAA AGGAAAAGGA GGATAT6TTA CTAACAGAGG AAGACAAAAG 3960
GTTGTCTOCC TAACT6ACAC MCMATCHG AAGACTGAGT TACAAGCAAT TCATCXAGCT 4020
TT6CAA6ATT CAGGGTTA6A A6TAAAGATA GTAACAGACT CACAATATGC ATTA6GAATC 4080
ATTGMGGAC AACCAGATAA AA6T0AATCA GAGTTAGTCA GTCAAATAAT A6AGCAGTTA 4140
ATAA3UUUU36 AAAA6GTCTA TCT66CAT66 GTACCAGCAC ACAAAGGAAT TGGAGGAAAT 4200
GAACAAGTA6 ATAAATTA6T CA6TGCTGGA ATCAGGAAAG TACTATTTTT AGATGGAATA 4260
GATAAGGCCC AA6AAGACCA TGAGAAATAT GACAGTAATT GGAGAGCAAT GGCTAGTGAC 4320
TTTAACCTAC CACCTATAGT AGCAAAAGAA ATAGTAGCCA GCTGTGATAA ATGTCAGCTA 4380
AAAGGAGAA6 CCATGCAT6G ACAA6TAGAC TGTAGTCCAG GAATATGGCA ACTAGATTGT 4440
ACACATTTA6 AA6GAAAAGT TATCCT6GTA GCA6TTCATG TAGCCAGTGG ATACATA6AA 4500
GCAGAAGTTA TTCCAGGAGA GACAGGGCAG GA6ACAGCAT ACTTTCTC7T AAAATTAGCA 4560
GGAAGATGGC CAGTAAAAAC AATACATACA GACAATGGCC CCAATTTCAC CAGTACTACG 4620
6TTAAGGCC6 CCTGTT6GTG GACGGGAATC AAGCAGGAAT TTGGCATTCC CTACAATCCC 4680
CAAAGTCAAG GAGTAATAGA ATCTATGAAT AAA6AATTAA AGAAAATTAT AGGACAGGTA 4740
AGAGATCAGG CTGAACATCT TAAGAGAGCA GTACAAATGG CAGTATTCAT CCACAATTTT 4800
AAAA6AAAAG GGG6GATTG6 GGGGTACAGT GCAGGGGAAA GAATAGTAGG CATAATAGCA 4860
ACAGACATAC AAACTAAAGA ACTACAAAAA CAAATTACAA AAATTCAAAA TTTTCGGGTT 4920
TATTACAGGG ACAGCAGAGA TCCACTTTGG AAAGGACCAG CAAAGCTTCT CTGGAAAGGT 4980
GAAGGGGCAG TAGTAATACA AGATAATAAT GACATAAAAG TAGTGCCAAG AAGAAAAGCA 5040
AAGGTCATTA GGGATTATGG AAAACAGACG GCAGGTGATG ATTGTGTGGC AAGCAGACAG 5100
GAT6AGGATT AGAACATCGA AAAGTTTAGT AAAACACCAT AT6TATATTT CAAAGAAAGC 5160
TAAAGGACGG TTTTATAGAC AtCACTATGA AAGCACTCAT CCAAGAATAA GTXCAGAAGT 5220
ACACATOCCA CTAGGGGATG CTAGATTGGT AATAACAACA TATTGGGGTC TGCATAGAGG 5280
AGAAAGAGAC TGGCATTTAG GTCAGGGAGT CICCATAGAA TGGAGGAAAA AGAGATATAG 5340
CACACAAGTA GACCCTGACC TAGCAGACGA CCTAATTCAT CTGCATTACT TTGATTGTTT 5400
TTGAGACTCT GGCATAAGAA AGGCCATATT AGGACATAGA GTTAGTCCTA TTTGTGAATT 5460
TCAAGCAGGA CATAACAA6G TAGGACCTCT ACAGTACTTG GCACTAACAG CATTAATAAC 5520
ACCAAAAAAG ATAAAGCCAC CTTTGCCTA6 TCTTAAGAAA CT6ACAGAGG ATAGATGGAA 5580
CAAGCCCCAG AAGACCAAGG GCCACAGAGG GAGCGATACA ATCAATGGGC ACTAGAGCTT 5640
TTAGA6GAGC TTAAGAATGA AGCTGTTAGA CATTTTCCTA GGATATGGCT CCATGGCTTA 5700
GGGCAACATA TCTATGAAAC TTATGGGGAT ACTT6GGCAG GAGTGGAAGC CATAATAA6A 5760
ATTCTACAAC AAC3GCTGTT TATTCATTTC AGAATTGGGT GTCGACATAG CAGAATAGGC 5820
ATTATTCGAC AGAGGAGAGC AAGAAATGGA GCCAGTAGAT CCTAGACTAG AGCCCT6GAA 5880
GCATCCAGGA AGTCAGCCTA AGACT6CTTG TACCACTT6C TATTGTAAAA AGTGTTGCTT 5940
TCATTGCCAA 6TTT6TTTCA dUlAAAAAGC CTTAG6CATC TCCTAT6GGA 66AAGAAGCG 6000

OAGAGAGOGA C6MGA6CTC CTGi^GACAG TGAGACTGAT GAA6TTTCTC TACCAAAGCA 6060 

GTAA6TA6TA CATGTAATGC AACCTTTAGT AATA6GAGCA AXAGTAGCAT TAGTAGTA6C 6120 

AGGAAXMTA GCAATAGTTG TGTGATCCAT AGTATTCATA 6AATATAGGA AAATAAGAAG 6180 

ACAAAGAAAA ATAGACAGGT TAATTGATAG AATAftGOGAA AGAGGAGAA6 ACAGT6GCA 6239 

ATG AGA GTG AAG 6GG ATC AGG AGG AAT TAT GAG GAG TGG TGG GGA TGG 6287 
M6t Arg Val Lys Gly lie Arg Arg Asn Tyr Gin His Trp Trp Gly Trp 
1 5 10 15 

G6C AGG ATG CTC CTT GGG TTA TTA ATG ATC TGT AGT GCT ACA GAA AAA 6335 
Gly Thr Met Leu Leu Gly Leu Leu Met He Cys Ser Ala Thr Glu Lys 
20 25 30 

TTG TGG GTC ACA GTC TAT TAT GGG 6TA CCT GTG TGG AAA GAA GCA ACC 6383 
Leu Trp Val Thr Val Tyr Tyr Gly Val Pro Val Trp Lys Glu Ala Thr 
35 40 45 

ACC ACT CTA TTT TGT GCA TCA GAT GCT AAA GCA TAT GAT ACA GAG GTA 6431 
Thr Thr Leu Phe Cys Ala Ser Asp Ala Lys Ala Tyr Asp Thr Glu Val 
50 55 60 

CAT AAT GTT TGG GCC ACA CAA GCC TGT GTA CCC ACA GAC CCC AAC CCA 6479 
His Asn Val Trp Ala Thr Gin Ala Cys Val Pro Thr Asp Pro Asn Pro 
65 70 75 . 80 

GAA GAA GTA GAA TTG GTA AAT GTG ACA GAA AAT TTT AAC ATG TGG A2^A 6527 
Gin Glu Val Glu Leu Val Asn Val Thr Glu Asn Phe Asn Met Trp Lys 
85 90 95 

AAT AAC ATG GTA GAA CAG ATG CAT GAG GAT ATA ATC AGT TTA TGG GAT 6575 
Asn Asn Met Val Glu Gin Met His Glu Asp He lie Ser Leu Trp Asp 
100 105 110 

CAA AGC CTA AAG CCA TGT GTA AAA TTA ACC CCA CTC TGT GTT ACT TTA 6623 
Gin Ser Leu Lys Pro Cys Val Lys Leu Thr Pro Leu Cys Val Thr Leu 
115 120 125 

2U1T TGC ACT GAT TTG AGG AAT ACT ACT AAT ACC AAT AAT AGT ACT GCT 6671 
Asn Cys Thr Asp Leu Arg Asn Thr Thr Asn Thr Asn Asn Ser Thr Ala 
130 135 140 

AAT AAC AAT AGT AAT AGC GAG GGA ACA ATA AAG GGA GGA GAA ATG AAA 6719 
Asn Asn Asn Ser Asn Ser Glu Gly Thr Zle Lys Gly Gly Glu Met Lys 
145 150 155 160 

AAC TGC TCT TTC AAT ATC ACC ACA AGC ATA AGA GAT AAG ATG CAG AAA 6767 
Asn C^s Ser Phe Asn He Thr Thr Ser He Arg Asp Lys Met Gin Lys 
165 170 175 

GAA TAT GCA CTT CTT TAT AAA CTT GAT ATA GTA TCA ATA GAT AAT GAT 6815 
Glu Tyr Ala Leu Leu Tyr Lys Leu Asp He Val Ser He Asp Asn Asp 
180 185 190 

AGT ACC AGC TAT AGG TTG ATA AGT TGT AAT ACC TCA GTC ATT ACA CAA 6863 
Ser Thr Ser Tyr Arg Leu Zle Ser Cys Asn Thr Ser Val He Thr Gin 
195 200 205 

GCT TGT CCA AAG ATA TCC TTT GAG CCA ATT CCC ATA CAC TAT TGT GCC 6911 
Ala Cys Pro Lye He Ser Phe Glu Pro He Pro He His Tyr Cys Ala 
210 215 220

CCG GCT GGT TTT GCG ATT CTA AAA TGT AAC GAT AAA AAG TTC AGT GGA 6959
Pro Ala Gly Phe Ala Ile Leu Lys Cys Asn Asp Lys Lys Phe Ser Gly
225 230 235 240 



AAA GGA TCA TGT AAA AAT GTC AGC ACA GTA CAA TGT ACA CAT GGA ATT 7007
Lys Gly Ser Cys Lys Asn Val Ser Thr Val Gln Cys Thr His Gly Ile
245 250 255

AGG CCA GTA GTA TCA ACT CAA CTG CTG TTA AAT GGC AGT CTA GCA GAA 7055
Arg Pro Val Val Ser Thr Gln Leu Leu Leu Asn Gly Ser Leu Ala Glu
260 265 270

GAA GAG GTA GTA ATT AGA TCT GAG AAT TTC ACT GAT AAT GCT AAA ACC 7103
Glu Glu Val Val Ile Arg Ser Glu Asn Phe Thr Asp Asn Ala Lys Thr
275 280 285 

ATC ATA GTA CAT CTG AAT GAA TCT GTA CAA ATT AAT TGT ACA AGA CCC 7151 
Ile Ile Val His Leu Asn Glu Ser Val Gln Ile Asn Cys Thr Arg Pro
290 295 300

AAC TAC AAT AAA AGA AAA AGG ATA CAT ATA GGA CCA GGG AGA GCA TTT 7199
Asn Tyr Asn Lys Arg Lys Arg Ile His Ile Gly Pro Gly Arg Ala Phe
305 310 315 320

TAT ACA ACA AAA AAT ATA ATA GGA ACT ATA AGA CAA GCA CAT TGT AAC 7247
Tyr Thr Thr Lys Asn Ile Ile Gly Thr Ile Arg Gln Ala His Cys Asn
325 330 335

ATT AGT AGA GCA AAA TGG AAT GAC ACT TTA AGA CAG ATA GTT AGC AAA 7295
Ile Ser Arg Ala Lys Trp Asn Asp Thr Leu Arg Gln Ile Val Ser Lys
340 345 350

TTA AAA GAA CAA TTT AAG AAT AAA ACA ATA GTC TTT AAT CAA TCC TCA 7343
Leu Lys Glu Gln Phe Lys Asn Lys Thr Ile Val Phe Asn Gln Ser Ser
355 360 365

GGA GGG GAC CCA GAA ATT GTA ATG CAC AGT TTT AAT TGT GGA GGG GAA 7391
Gly Gly Asp Pro Glu Ile Val Met His Ser Phe Asn Cys Gly Gly Glu
370 375 380 



TTT TTC TAC TGT AAT ACA TCA CCA CTG TTT AAT AGT ACT TGG AAT GGT 7439 
Phe Phe Tyr Cys Asn Thr Ser Pro Leu Phe Asn Ser Thr Trp Asn Gly 
385 390 395 400 

AAT AAT ACT TGG AAT AAT ACT ACA GGG TCA AAT AAC AAT ATC ACA CTT 7487 
Asn Asn Thr Trp Asn Asn Thr Thr Gly Ser Asn Asn Asn Ile Thr Leu
405 410 415

CAA TGC AAA ATA AAA CAA ATT ATA AAC ATG TGG CAG GAA GTA GGA AAA 7535
Gln Cys Lys Ile Lys Gln Ile Ile Asn Met Trp Gln Glu Val Gly Lys
420 425 430

GCA ATG TAT GCC CCT CCC ATT GAA GGA CAA ATT AGA TGT TCA TCA AAT 7583
Ala Met Tyr Ala Pro Pro Ile Glu Gly Gln Ile Arg Cys Ser Ser Asn
435 440 445

ATT ACA GGG CTA CTA TTA ACA AGA GAT GGT GGT AAG GAC ACG GAC ACG 7631
Ile Thr Gly Leu Leu Leu Thr Arg Asp Gly Gly Lys Asp Thr Asp Thr
450 455 460

AAC GAC ACC GAG ATC TTC AGA CCT GGA GGA GGA GAT ATG AGG GAC AAT 7679
Asn Asp Thr Glu Ile Phe Arg Pro Gly Gly Gly Asp Met Arg Asp Asn
465 470 475 480

TGG AGA AGT GAA TTA TAT AAA TAT AAA GTA GTA ACA ATT GAA CCA TTA 7727
Trp Arg Ser Glu Leu Tyr Lys Tyr Lys Val Val Thr Ile Glu Pro Leu
485 490 495

GGA GTA GCA CCC ACC AAG GCA AAG AGA AGA GTG GTG CAG AGA GAA AAA 7775
Gly Val Ala Pro Thr Lys Ala Lys Arg Arg Val Val Gln Arg Glu Lys
500 505 510

AGA GCA GCG ATA GGA GCT CTG TTC CTT GGG TTC TTA GGA GCA GCA GGA 7823
Arg Ala Ala Ile Gly Ala Leu Phe Leu Gly Phe Leu Gly Ala Ala Gly
515 520 525 



AGC ACT ATG GGC GCA GCG TCA GTG ACG CTG ACG GTA CAG GCC AGA CTA 7871
Ser Thr Met Gly Ala Ala Ser Val Thr Leu Thr Val Gln Ala Arg Leu
530 535 540

TTA TTG TCT GGT ATA GTG CAA CAG CAG AAC AAT TTG CTG AGG GCC ATT 7919
Leu Leu Ser Gly Ile Val Gln Gln Gln Asn Asn Leu Leu Arg Ala Ile
545 550 555 560

GAG GCG CAA CAG CAT ATG TTG CAA CTC ACA GTC TGG GGC ATC AAG CAG 7967
Glu Ala Gln Gln His Met Leu Gln Leu Thr Val Trp Gly Ile Lys Gln
565 570 575 

CTC CAG GCA AGA GTC CTG GCT GTG GAA AGA TAC CTA AAG GAT CAA CAG 8015 
Leu Gln Ala Arg Val Leu Ala Val Glu Arg Tyr Leu Lys Asp Gln Gln
580 585 590

CTC CTG GGG TTT TGG GGT TGC TCT GGA AAA CTC ATT TGC ACC ACT ACT 8063
Leu Leu Gly Phe Trp Gly Cys Ser Gly Lys Leu Ile Cys Thr Thr Thr
595 600 605

GTG CCT TGG AAT GCT AGT TGG AGT AAT AAA TCT CTG GAT GAT ATT TGG 8111
Val Pro Trp Asn Ala Ser Trp Ser Asn Lys Ser Leu Asp Asp Ile Trp
610 615 620

AAT AAC ATG ACC TGG ATG CAG TGG GAA AGA GAA ATT GAC AAT TAC ACA 8159
Asn Asn Met Thr Trp Met Gln Trp Glu Arg Glu Ile Asp Asn Tyr Thr
625 630 635 640

AGC TTA ATA TAC TCA TTA CTA GAA AAA TCG CAA ACC CAA CAA GAA AAG 8207
Ser Leu Ile Tyr Ser Leu Leu Glu Lys Ser Gln Thr Gln Gln Glu Lys
645 650 655

AAT GAA CAA GAA TTA TTG GAA TTG GAT AAA TGG GCA AGT TTG TGG AAT 8255
Asn Glu Gln Glu Leu Leu Glu Leu Asp Lys Trp Ala Ser Leu Trp Asn
660 665 670 



TGG TTT GAC ATA ACA AAT TGG CTG TGG TAT ATA AAA ATA TTC ATA ATG 8303 
Trp Phe Asp Ile Thr Asn Trp Leu Trp Tyr Ile Lys Ile Phe Ile Met
675 680 685

ATA GTA GGA GGC TTG GTA GGT TTA AGA ATA GTT TTT GCT GTA CTT TCT 8351
Ile Val Gly Gly Leu Val Gly Leu Arg Ile Val Phe Ala Val Leu Ser
690 695 700

ATA GTG AAT AGA GTT AGG CAG GGA TAC TCA CCA TTG TCG TTG CAG ACC 8399
Ile Val Asn Arg Val Arg Gln Gly Tyr Ser Pro Leu Ser Leu Gln Thr
705 710 715 720

CGC CCC CCA GTT CCG AGG GGA CCC GAC AGG CCC GAA GGA ATC GAA GAA 8447
Arg Pro Pro Val Pro Arg Gly Pro Asp Arg Pro Glu Gly Ile Glu Glu
725 730 735

GAA GGT GGA GAG AGA GAC AGA GAC ACA TCC GGT CGA TTA GTG CAT GGA 8495
Glu Gly Gly Glu Arg Asp Arg Asp Thr Ser Gly Arg Leu Val His Gly
740 745 750

TTC TTA GCA ATT ATC TGG GTC GAC CTG CGG AGC CTG TTC CTC TTC AGC 8543
Phe Leu Ala Ile Ile Trp Val Asp Leu Arg Ser Leu Phe Leu Phe Ser
755 760 765

TAC CAC CAC AGA GAC TTA CTC TTG ATT GCA GCG AGG ATT GTG GAA CTT 8591
Tyr His His Arg Asp Leu Leu Leu Ile Ala Ala Arg Ile Val Glu Leu
770 775 780

CTG GGA CGC AGG GGG TGG GAA GTC CTC AAA TAT TGG TGG AAT CTC CTA 8639
Leu Gly Arg Arg Gly Trp Glu Val Leu Lys Tyr Trp Trp Asn Leu Leu
785 790 795 800

CAG TAT TGG AGT CAG GAA CTA AAG AGT AGT GCT GTT AGC TTG CTT AAT 8687
Gln Tyr Trp Ser Gln Glu Leu Lys Ser Ser Ala Val Ser Leu Leu Asn
805 810 815



GCC ACA GCT ATA GCA GTA GCT GAG GGG ACA GAT AGG GTT ATA GAA GTA 8735 
Ala Thr Ala Ile Ala Val Ala Glu Gly Thr Asp Arg Val Ile Glu Val
820 825 830

CTG CAA AGA GCT GGT AGA GCT ATT CTC CAC ATA CCT ACA AGA ATA AGA 8783
Leu Gln Arg Ala Gly Arg Ala Ile Leu His Ile Pro Thr Arg Ile Arg
835 840 845

CAG GGC TTG GAA AGG GCT TTG CTA TAAGATGGGT GGCAAATGGT CAAAACGTGT 8837
Gln Gly Leu Glu Arg Ala Leu Leu
850 855 

GACTGGATGG CCTACTGTAA GGGAAAGAAT GAGACGAGCT GAACCAGCTG AGCTAGCAGC 8897

AGATGGGGTG GGAGCAGCAT CCCGAGACCT GGAAAAACAT GGAGCACTCA CAAGTAGCAA 8957 

TACAGCAGCT ACCAATGCTG ATTGTGCCTG GCTAGAAGCA CAAGAGGAGG AGGAAGTGGG 9017

TTTTCCAGTC AAACCTCAGG TACCTTTAAG ACCAATGACT TACAAAGCAG CTTTAGATCT 9077

TAGCCACTTT TTAAAAGAAA AGGGGGGACT GGATGGGTTA ATTTACTCCC AAAAGAGACA 9137 

AGACATCCTT GATCTGTGGG TCTACCACAC ACAAGGCTAC TTCCCTGATT GGCAGAACTA 9197 

CACACCAGGG CCAGGGATCA GATATCCACT GACCTTTGGA TGGTGCTTCA AGCTAGTACC 9257 

AGTTGAGCCA GAGAAGATAG AAGAGGCCAA TAAAGGAGAG AACAACTGCT TGTTACACCC 9317

TATGAGCCAG CATGGATGGA TGACCCGGAG AGAGAAGTGT TAGTGTGGAA GTCTGACAGC 9377

CACCTAGCAT TTCAGCATTA TGCCCGAGAG CTGCATCCGG AGTACTACAA GAACTGCTGA 9437 

GATCGAGCTA TCTAGAAGGG ACTTTCCGCT GGGGACTTTC CAGGGAGGTG TGGCCTGGGC 9497

GGGACCGGGG AGTGGCGAGC CCTCAGATCG TGCATATAAG CAGCTGCTTT CTGCCTGTAC 9557

TGGGTCTCTC TGGTTAGACC AGATCTGAGC CTGGGAGCTC TCTGGCTAAC TAGGGAACCC 9617

ACTGCTTAAG CCTCAATAAA GCTTGCCTTG AGTGCTTCAA GTAGTGTGTG CCCGTCTGTT 9677 

ATGTGACTCT GGTAGCTAGA GATCCCTCAG ATCCTTTTAG GCAGTGTGGA AAATCTCTAG 9737 

CA 9739 

Met Arg Val Lys Gly Ile Arg Arg Asn Tyr Gln His Trp Trp Gly Trp
1 5 10 15

Gly Thr Met Leu Leu Gly Leu Leu Met Ile Cys Ser Ala Thr Glu Lys
20 25 30 

Leu Trp Val Thr Val Tyr Tyr Gly Val Pro Val Trp Lys Glu Ala Thr 
35 40 45 

Thr Thr Leu Phe Cys Ala Ser Asp Ala Lys Ala Tyr Asp Thr Glu Val 
50 55 60

His Asn Val Trp Ala Thr Gln Ala Cys Val Pro Thr Asp Pro Asn Pro
65 70 75 80

Gln Glu Val Glu Leu Val Asn Val Thr Glu Asn Phe Asn Met Trp Lys
85 90 95

Asn Asn Met Val Glu Gln Met His Glu Asp Ile Ile Ser Leu Trp Asp
100 105 110

Gln Ser Leu Lys Pro Cys Val Lys Leu Thr Pro Leu Cys Val Thr Leu
115 120 125 

Asn Cys Thr Asp Leu Arg Asn Thr Thr Asn Thr Asn Asn Ser Thr Ala 
130 135 140 

Asn Asn Asn Ser Asn Ser Glu Gly Thr Ile Lys Gly Gly Glu Met Lys
145 150 155 160

Asn Cys Ser Phe Asn Ile Thr Thr Ser Ile Arg Asp Lys Met Gln Lys
165 170 175

Glu Tyr Ala Leu Leu Tyr Lys Leu Asp Ile Val Ser Ile Asp Asn Asp
180 185 190

Ser Thr Ser Tyr Arg Leu Ile Ser Cys Asn Thr Ser Val Ile Thr Gln
195 200 205

Ala Cys Pro Lys Ile Ser Phe Glu Pro Ile Pro Ile His Tyr Cys Ala
210 215 220

Pro Ala Gly Phe Ala Ile Leu Lys Cys Asn Asp Lys Lys Phe Ser Gly
225 230 235 240

Lys Gly Ser Cys Lys Asn Val Ser Thr Val Gln Cys Thr His Gly Ile
245 250 255 

Arg Pro Val Val Ser Thr Gln Leu Leu Leu Asn Gly Ser Leu Ala Glu
260 265 270

Glu Glu Val Val Ile Arg Ser Glu Asn Phe Thr Asp Asn Ala Lys Thr
275 280 285

Ile Ile Val His Leu Asn Glu Ser Val Gln Ile Asn Cys Thr Arg Pro
290 295 300

Asn Tyr Asn Lys Arg Lys Arg Ile His Ile Gly Pro Gly Arg Ala Phe
305 310 315 320

Tyr Thr Thr Lys Asn Ile Ile Gly Thr Ile Arg Gln Ala His Cys Asn
325 330 335

Ile Ser Arg Ala Lys Trp Asn Asp Thr Leu Arg Gln Ile Val Ser Lys
340 345 350

Leu Lys Glu Gln Phe Lys Asn Lys Thr Ile Val Phe Asn Gln Ser Ser
355 360 365

Gly Gly Asp Pro Glu Ile Val Met His Ser Phe Asn Cys Gly Gly Glu
370 375 380 

Phe Phe Tyr Cys Asn Thr Ser Pro Leu Phe Asn Ser Thr Trp Asn Gly 
385 390 395 400 

Asn Asn Thr Trp Asn Asn Thr Thr Gly Ser Asn Asn Asn Ile Thr Leu
405 410 415

Gln Cys Lys Ile Lys Gln Ile Ile Asn Met Trp Gln Glu Val Gly Lys
420 425 430

Ala Met Tyr Ala Pro Pro Ile Glu Gly Gln Ile Arg Cys Ser Ser Asn
435 440 445

Ile Thr Gly Leu Leu Leu Thr Arg Asp Gly Gly Lys Asp Thr Asp Thr
450 455 460

Asn Asp Thr Glu Ile Phe Arg Pro Gly Gly Gly Asp Met Arg Asp Asn
465 470 475 480

Trp Arg Ser Glu Leu Tyr Lys Tyr Lys Val Val Thr Ile Glu Pro Leu
485 490 495

Gly Val Ala Pro Thr Lys Ala Lys Arg Arg Val Val Gln Arg Glu Lys
500 505 510 

Arg Ala Ala Ile Gly Ala Leu Phe Leu Gly Phe Leu Gly Ala Ala Gly
515 520 525

Ser Thr Met Gly Ala Ala Ser Val Thr Leu Thr Val Gln Ala Arg Leu
530 535 540

Leu Leu Ser Gly Ile Val Gln Gln Gln Asn Asn Leu Leu Arg Ala Ile
545 550 555 560

Glu Ala Gln Gln His Met Leu Gln Leu Thr Val Trp Gly Ile Lys Gln
565 570 575

Leu Gln Ala Arg Val Leu Ala Val Glu Arg Tyr Leu Lys Asp Gln Gln
580 585 590

Leu Leu Gly Phe Trp Gly Cys Ser Gly Lys Leu Ile Cys Thr Thr Thr
595 600 605

Val Pro Trp Asn Ala Ser Trp Ser Asn Lys Ser Leu Asp Asp Ile Trp
610 615 620

Asn Asn Met Thr Trp Met Gln Trp Glu Arg Glu Ile Asp Asn Tyr Thr
625 630 635 640

Ser Leu Ile Tyr Ser Leu Leu Glu Lys Ser Gln Thr Gln Gln Glu Lys
645 650 655

Asn Glu Gln Glu Leu Leu Glu Leu Asp Lys Trp Ala Ser Leu Trp Asn
660 665 670

Trp Phe Asp Ile Thr Asn Trp Leu Trp Tyr Ile Lys Ile Phe Ile Met
675 680 685

Ile Val Gly Gly Leu Val Gly Leu Arg Ile Val Phe Ala Val Leu Ser
690 695 700

Ile Val Asn Arg Val Arg Gln Gly Tyr Ser Pro Leu Ser Leu Gln Thr
705 710 715 720

Arg Pro Pro Val Pro Arg Gly Pro Asp Arg Pro Glu Gly Ile Glu Glu
725 730 735

Glu Gly Gly Glu Arg Asp Arg Asp Thr Ser Gly Arg Leu Val His Gly
740 745 750

Phe Leu Ala Ile Ile Trp Val Asp Leu Arg Ser Leu Phe Leu Phe Ser
755 760 765

Tyr His His Arg Asp Leu Leu Leu Ile Ala Ala Arg Ile Val Glu Leu
770 775 780

Leu Gly Arg Arg Gly Trp Glu Val Leu Lys Tyr Trp Trp Asn Leu Leu
785 790 795 800

Gln Tyr Trp Ser Gln Glu Leu Lys Ser Ser Ala Val Ser Leu Leu Asn
805 810 815

Ala Thr Ala Ile Ala Val Ala Glu Gly Thr Asp Arg Val Ile Glu Val
820 825 830

Leu Gln Arg Ala Gly Arg Ala Ile Leu His Ile Pro Thr Arg Ile Arg
835 840 845

Gln Gly Leu Glu Arg Ala Leu Leu
850 855

TGGATGGGTT AATTTACTCC GAAMAGAGA AGAGATCCTT GATCTGTGGG TCTACCACAC 60
ACAAGGCTAC TTCCCTGATT GGCAGAACTA CACACCAGGG CCAGGGATCA GATATCCACT 120
GACCTTTCGA TGGTGCTTGA AGCTAGTACC AGTTGAGCCA GAGAAGATAG AAGAGGCCAA 180
TAAAGGAGAG AACAACTGCT TGTTACACCC TATGAGCCAG CATGGGATGG ATGACCCGGA 240
GAGAGAAGTG TTAGTGTGGA AGTCTGACAG CCACCTAGCA TTTCAGCATT ATGCCCGAGA 300
GCTGCATCOG GAGTACTACA AGAACT6CT6 ACATCGAGCT ATCTAGSAAGG GACTTTCCGC 360 
TCGGGACTTT CGAGGGAGGT GT6GCCTGGG CGGGACCGGG GAGTCGCGAG CCCTCA6AT6 420 
CTGCATATAA GCAGCTGCTT TCTGCCTGTA CT6G6TCTCT CTGGTTAGAC GAGATCTGAG 480 
CCTGGGAGCT CTCTGGCTAA CTAGGGAACC CACTGCTTAA GCCTCAATAA AGCTTGCCTT 540 
GAGTGCTTCA 'AGTAGTGTGT GCCGGTCTGT TATGTGACTC TGGTA6CTAG A6ATCCCTCA 600 
6ATCCTTTTA GGCAGTGTGG AAAATCTCTA GCA6TGGCGC CCGAACAGGG ACTTGAAAGC 660 
GAAAGAGAAA CGAGA6GAGC TCTCTCGACG CA6GACTCGG CTTGCTGAAG CGCGCACGGC 720 
AAGAGGCGAG GGGCGGCGAC TGGTGAGTAG GCCAAAATTC . TTGACTAGCG GAGGCTAGAA 780 
GGAGAGAGAT GGGTGOGAGA GCGTCGGTAT TAAGCGGGGG AGAATTAGAT CGATGGGAAA 840 
AAATTCGGTT AAGGCCAGGG GGAAAGAAAA AATATAAATT AAAACATGTA GTATGGGCAA 900 
GCAGGGAGCT AGAACGATTC GCAGTCAATC CT6G0CTGTT AGAAACATCA GAAGGCTGTA 960 
GACAAATACT GGGAGAGCTA CAACCATCCC TTCAGACAGG ATCAGAAGAA CTTAAATCAT 1020 
TATATAATAC AGTAGCAACC CTCTATTGTG TGCATCAAAA GATAGAGATA AAAGACACCA 1080 
AG6AAGCTTT AGAGAAAATA GAGGAAGAGC AAAACAAAAG TAAGAAAAAA GCACAGCAAG 1140 
CAGTAGCTGA CACAGGAAAC AGAGGAAACA GCAGCCAAGT CAGCCAAAAT TACCCCATAG 1200 
TGCAGAACAT CCAGGGGCAA ATGGTACATC AGGCCATATC ACCTAGAACT TTAAATGCAT 1260 
GGGTAAAAGT AGTAGAAGAG AAGGCTTTGA GCCCAGAAGT AATACCCATG TTTTCAGCAT 1320 
TATCAGAAGG AGOCACCCCA CAAGATTTAA ACACCATGCT AAACACAGTG GGGGGACATC 1380 
AAGCAGCCAT GCAAATGTTA AAAGACACCA TCAATGAG6A AGCTGCAGAA TGGGATAGAT 1440 
TGCATCCAGT GCATGCA6GG CCTATTGCAC CAGGCCAGAT GAGAGAACCA AGGGGAAGTG 1500 
ACATAGCAGG AACTACTAGT ACCCTTCAGG AACAAATAGG ATGGATGACA AATAATCCAC 1560 
CTATCCCAGT AG6AGAAATC TATAAAAGAT GGATAATCCT GGGATTAAAT AAAATAGTAA 1620 
GQAT6TATAG CCCTTCCAGC ATTCTGGAGA TAAGACAAGG ACCAAAGGAA CCCTTTAGAG 1680 
ACTATGTA6A COGGTTCTAT AAAACTCTAA GAGCOGAGCA AGCTTCACAG 6AGGTAAAAA 1740 
ATTGGATGAC AGAAACCTTG TTGGTCCAAA ATGCGAACCC AGATTGTAAG ACTATTTTAA 1800 
AAGCATTGGG ACCAGCAGCT ACACTA6AAG AAATGATGAC AGCATGTCAG GGAGTGGGAG 1860 
GACCTGGTCA TAAAGCAAGA GTTTTGGCGG AAGCGATGAG CCAAGTAACA AATTCAGCTA 1920
CCATAATGAT GCAGAGAGGC AATTTTAGGA ATCAAAGAAA GATTATCAAG TGCTTCAATT 1980
GTGGCAAAGA AGGGCACATA GCCAAAAATT GCAGGGCCCC TAGGAAAAGG GGCTGTTGGA 2040
AATGTGGAAA GGAAGGACAC CAAATGAAAG ATTGTACTGA GAGACAGGCT AATTTTTTAG 2100
GGAAGATCTG GCCTTCCTGC AAGGGAAGGC AGGGAATTTT CCTCAGAGCA GAACAGAGCC 2160
AACAGCCCCA CCAGAAGAGA GCTTCAGGTT TGGGGAAGAG ACAACAACTC CCTATCAGAA 2220
GCAGGAGAAG AAGCAGGAGA CGATAGACAA GGACCTGTAT CCTTTAGCTT CCCTCAAATC 2280 
ACTCTTTGGC AACGACCCAT TGTCACAATA AAGATAGGGG GGCAACTAAA GGAAGCTCTA 2340 
TTAGATACAG GAGCAGATGA TACAGTATTA GAAGAAATGA ATTTGCCAGG AAGATGGAAA 2400 
CCAAAAATGA TAGGGGGAAT TGGAGGTTTT ATCAAAGTAA GACAGTATGA TCAGATAACC 2460 
ATAGAAATCT GTGGACATAA AGCTATAGGT ACAGTATTAG TAGGACCTAC ACCTGTCAAC 2520 
ATAATTGGAA GAAATCTGTT GACTCAGCTT GGGTGCACTT TAAATTTTCC CATTAGTCCT 2580 
ATTGAAACTG TACCAGTAAA ATTAAAGCCA GGAATGGATG GCCCAAAAGT TAAACAATGG 2640 
CCATTGACAG AAGAAAAAAT AAAAGCATTA ATAGAAATTT GTACAGAAAT GGAAAAGGAA 2700 
GGGAAAATTT CAAAAATTGG GCCTGAAAAT CCATACAATA CTCCAGTATT TGCCATAAAG 2760 
AAAAAAGACA GTACTAAATG GAGAAAATTA GTAGATTTCA GAGAACTTAA TAAGAAAACT 2820 
CAA6ACTTCT 6GGAAGTTCA ATTAGGAATA CCACATCCT6 CAGG6TTAAA AAAGAAAAAA 2880 
TGAGTAACAG TACTGGATGT GGGTGAT6CA TATTTTTCAG TTCCCTTAGA TAAAGACTTC 2940 
AGGAA6TATA CTGCATTTAC CATACCTAGT ATAAACAATG AAACACCAGG 6ATTAGATAT 3000 
GAGTACAATG TGCTTCCACA GGGATGGAAA GGATCACCAG CAATATTCCA AAGTAGCATG 3060 
ACAAAAATCT XAGAGCCTTT TAGAAAACAA AATCCAGACA TAGTTATCTA TCAATACATG 3120 
GATGATTTGT ATGTAGGATC TGACTTAGAA ATAGGGCAGC ATAGAGCAAA AATAGAGGAA 3180 
CT6AGA0GAC ATCTGTTGAG GTGGG6ATTT ACCACACCAG ACAAAAAACA TCAGAAAGAA 3240 
CCTCCATTCC TTTGGATGG6 TTATGAACTC CATCCTGATA AATGGACAGT ACAGCCTATA 3300 
GT6CT6CGAG AAAAAGACA6 CTGGACTGTC AATGACATAC AGAAGTTA6T GGGAAAATTG 3360 
AATTGG6CAA GTGAAATTTA CGCA66GATT AAAGTAAAGC AATTATGTAA ACTCCTTAGA 3420 
GGAACCAAAG GACTAACA6A AGTAATACCA CTAACAGAAG AAGCAGAGCT AGAACTGGCA 3480 
6AAAAGAGGG AAATTCTAAA AGAACCAGTA CATGGAGTGT ATTATGACCC ATCAAAAGAC 3540 
TTAATAGCAG AAGTACAGAA GCAGGGGGAA GGCCAAT6GA CATATCAAAT TTATCAAGAG 3600 
CCATTTAAAA ATCTGAAAAC AGGCAAATAT GCAAGAATGA GGGGTGCCCA CACTAATGAT 3660 
GTAAAACAAT TAAGAGAGGC AGTGCAAAAA ATAGCCACAG AAAGCATAGT AATATGGGGA 3720 
AAGACTCCTA AATTTAGACT ACCGATACAA AAAGAAACAT GGGAAACATG GTGGACAGAG 3780 
TATTOGCAA6 CCACCTGGAT TCCT6AGTGG GAGTTTGTGA ATACCCCTCC CTTAGTGAAA 3840 
TTATGGTACC AGTTAGAGAA AGAACCCATA GTAGGAGCAG AAACTTTCTA TGTAGATGGG 3900 
GCAGCTAACA GGGAGACTAA AAAAGGAAAA GCAGGATATG TTACTAACAG AGGAAGACAA 3960
AAGGTTGTCT CCCTAACTGA GAGAACAAAT GA0MGACTG AGTTACAAGC AATTCATCTA 4020
GCTTTGCAAG ATTCAOGGTT AGAA6TAAAC ATAGTAAGA6 ACTCACAATA T6CATTA66A 4080 
ATGATTCAA6 GACAACGA6A TAAAAGT6AA TCA6A6TTA6 TCA6TCAAAT AATA6AGGA6 4140 
TTAATAAAAA A66AAAAGGT CTATCT6GCA T6GGTACCA6 CACACAAA6G AATT66AGGA 4200 
AAT6AAGAA6 TAGATAAATT A6TGAGTGCT 6GAATCAGGA AAGTACTATT TTTAGATGGA 4260 
ATAGATAA6G CCCAAGAAGA OCATCAGAAA TATCACA6TA ATT6GAGAGC AATGGCTAGT 4320 
6ACTTTAACC TACCACCTAT A6TAGCAAAA GAAATAGTAG pCAGCTGTGA TAAATGTCAG 4380 
CTAAAAGGAG AAGCGATGCA TGGACAAGTA GACTGTAGTC CA6GAATATG 6CAACTA6AT 4440 
TGTACACATT XAGAA6GAAA AGTTATCCTG GTAGCAGTTC ATGTAGCCAG TG6ATAGATA 4500 
6AAGCAGAAG TTATTCCAGC A6AGACAGGG CA6GA6ACAG CATACTTTCT CTTAAAATTA 4560 
GGAGGAAGAT GGCCA6TAAA AACAATACAT ACAGACAATG GCCCCAATTT CACCAGTACT 4620 
AC6GTTAAGG C06CCTCTTG GTGGGC6GGG ATCAA6CAG6 AATTT6GCAT TCCCTACAAT 4680 
CCCCAAAGTC AAGGAGTAAT A6AATCTATG AATAAAGAAT TAAAGAAAAT TATA6GACAG 4740 
GTAAGAGATC A6GCTGAACA TCTTAAGACA GCAGTACAAA TGGCAGTATT CATCCACAAT 4800 
TTTAAAAGAA AAGGGGGGAT TGGGGGGTAC AGTGCAGGGG AAA6AATAGT AGACATAATA 4860 
GCAACAGACA TACAAACTAA AGAACTACAA AMCAAATTA CAAAAATTCA AAATTTTCGG 4920 
GTTTATTACA GGGACAGCAG AGATCCACTT TGGAAAGGAC CAGCAAAGCT TCTCTGGAAA 4980 
GGTGAAGGGG CAGTAGTAAT ACAAGATAAT AGTGACATAA AAGTAGTGCC AAGAAGAAAA 5040 
GCAAAGATCA TTAGGGATXA TGGAAAACAG AT6GCAGGTG ATGATTGTGT GGCAAGTAGA 5100 
CAGGATGAGG ATTAGAACAT GGAAAAGTTT AGTAAAACAC CATATGTATA TTTCAAAGAA 5160 
AGCTAAAGGA TGGTTTTATA GACATCACTA TGAAAGCACT CATCCAAGAA TAA6TTCAGA 5220 
AGTACACATC CCACTAGGGG ATGCTAGATT GGTAATAACA ACATATTGGG GTCTCCATAC 5280 
AGGAGAAAGA GACTGGCATT TAGGTCAGGG AGTCTCCATA GAATGGA6GA AAAAGAGATA 5340 
TAGCACAGAA GTAGACCCT6 ACCTA6CAGA CCACCTAATT GATCT6CATT ACTTTGATT6 5400 
TTTTTCAGAC TCTGCCATAA GAAAGGCCAT ATTAGGACAT AGAGTTAGTC CTATTTGTGA 5460 
ATXTCAAGCA GGACATAACA AGGTA6GATC TCTACAGTAC TTGGCACTAA CA6CATTAAT 5520 
AACACCAAAA AAGATAAAGC CACCTTTGCC TA6TGTTAAG AAACTGACAG AG6ATAGATG 5580 
GAACAAGCCC CAGAA6ACCA AGGGCCACAG AGGGA6CCAT ACAATGAATG GGCATTAGAG 5640 
CTTTTAGAGG AGCTTAAGAA TGAA6CT6TT AGACATTTTC CTA6GATATG GCTCCATGGC 5700 
TTAGGGCAAC ATATCTATGA AACTTATGGG GATACTTGGG CAG6AGTGGA AGCCATAATA 5760 
AGAATTCTAC AACAACTGCT GTTTATTCAT TTCAGAATTG 6GTGTCGACA TAGCAGAATA 5820 
GGCATTATTC GACAGAGGAG AGCAAGAAAT GGAGCCAGTA GATCCTAGAC TAGAGCCCTG 5880 
GAAGCATCCA GGAAGTCAGC CTAAGACTGC TTGTACCACT TGCTATTGTA AAAAGTGTTG 5940
CTTTCATTGC CAAGTTTGTT TCAGAAMAft AGGCTTAGGC ATCTCCTATG GCAGGAAGAA 6000

GCGGAGACAG CGACGAAGAG CTCCTGAAGA CAGTCAGACT CATCAAGTTT CTCTACCAAA 6060

GCAGTAAGTA GTACATGTAA TGCAACCTTT AGTAATAGCA GCAATAGTAG CATTAGTAGT 6120

AGCAGGAATA ATAGCAATAG TTGTGTGATC CATAGTATTC ATAGAATATA GGAAAATAAG 6180 

AAGACAAAGA AAAATAGACA GGGTAATTGA GAGAATAAGC GAAAGAGCAG AAGACAGTGG 6240 

CA ATG AGA GTG AAG GGG ATC AGG AGG AAT TAT CAG CAC TGG TGG GGA 6287
Met Arg Val Lys Gly Ile Arg Arg Asn Tyr Gln His Trp Trp Gly
1 5 10 15

TGG GGC ACG ATG CTC CTT GGG TTA TTA ATG ATC TGT AGT GCT ACA GAA 6335
Trp Gly Thr Met Leu Leu Gly Leu Leu Met Ile Cys Ser Ala Thr Glu
20 25 30

AAA TTG TGG GTC ACA GTC TAT TAT GGG GTA CCT GTG TGG AAA GAA GCA 6383
Lys Leu Trp Val Thr Val Tyr Tyr Gly Val Pro Val Trp Lys Glu Ala
35 40 45

ACC ACC ACT CTA TTT TGT GCA TCA GAT GCT AAA GCA TAT GAT ACA GAG 6431
Thr Thr Thr Leu Phe Cys Ala Ser Asp Ala Lys Ala Tyr Asp Thr Glu
50 55 60

GTA CAT AAT GTT TGG GCC ACA CAT GCC TGT GTA CCC ACA GAC CCC AAC 6479
Val His Asn Val Trp Ala Thr His Ala Cys Val Pro Thr Asp Pro Asn
65 70 75

CCA CAA GAA GTA GAA TTG GTA AAT GTG ACA GAA AAT TTT AAC ATG TGG 6527
Pro Gln Glu Val Glu Leu Val Asn Val Thr Glu Asn Phe Asn Met Trp
80 85 90 95

AAA AAT AAC ATG GTA GAA CAG ATG CAT GAG GAT ATA ATC AGT TTA TGG 6575
Lys Asn Asn Met Val Glu Gln Met His Glu Asp Ile Ile Ser Leu Trp
100 105 110 

GAT CAA AGC CTA AAG CCA TGT GTA AAA TTA ACC CCA CTC TGT GTT ACT 6623 
Asp Gin Ser Leu Lys Pro Cys Val Lys Leu Thr Pro Leu Cys Val Thr 
115 120 125 

TTA AAT TGC ACT GAT TTG AGG AAT ACT ACT AAT ACC AAT AAT AGT ACT 6671 
Leu Asn Cys Thr Asp Leu Arg Asn Thr Thr Asn Thr Asn Asn Ser Thr 
130 135 140 

GCT AAT AAC AAT AGT AAT AGC GAG GGA ACA ATA AAG GGA GGA GAA ATG 6719 
Ala Asn Asn Asn Ser Asn Ser Glu Gly Thr He Lys Gly Gly Glu Met 
145 150 155 

AAA AAC TGC TCT TTC AAT ATC ACC ACA AGC ATA AGA GAT AAG ATG CAG 6767 
Lys Asn Cys Ser Phe Asn Ile Thr Thr Ser Ile Arg Asp Lys Met Gln
160 165 170 175

AAA GAA TAT GCA CTT CTT TAT AAA CTT GAT ATA GTA TCA ATA AAT AAT 6815
Lys Glu Tyr Ala Leu Leu Tyr Lys Leu Asp Ile Val Ser Ile Asn Asn
180 185 190

GAT AGT ACC AGC TAT AGG TTG ATA AGT TGT AAT ACC TCA GTC ATT ACA 6863
Asp Ser Thr Ser Tyr Arg Leu Ile Ser Cys Asn Thr Ser Val Ile Thr
195 200 205

CAA GCT TGT CCA AAG ATA TCC TTT GAG CCA ATT CCC ATA CAC TAT TGT 6911
Gln Ala Cys Pro Lys Ile Ser Phe Glu Pro Ile Pro Ile His Tyr Cys
210 215 220

GCC CCG GCT GGT TTT GCG ATT CTA AAG TGT AAC GAT AAA AAG TTC AGT 6959
Ala Pro Ala Gly Phe Ala Ile Leu Lys Cys Asn Asp Lys Lys Phe Ser
225 230 235

GGA AAA GGA TCA TGT AAA AAT GTC AGC ACA GTA CAA TGT ACA CAT GGA 7007
Gly Lys Gly Ser Cys Lys Asn Val Ser Thr Val Gln Cys Thr His Gly
240 245 250 255 

ATT AGG CCA GTA GTA TCA ACT CAA CTG CTG TTA AAT GGC AGT CTA GCA 7055 
He Arg Pro Val Val Ser Thr Gin Leu Leu Leu Asn Gly Ser Leu Ala 
260 265 270 

GAA GAA GAG GTA GTA ATT AGA TCT GAG AAT TTC AAT GAT AAT GCT AAA 7103 
Glu Glu Glu Val Val He Arg Ser Glu Asn Phe Asn Asp Asn Ala Lys 
275 280 285 

ACC ATC ATA GTA CAT CTG AAT GAA TCT GTA CAA ATT AAT TGT ACA AGA 7151 
Thr Ile Ile Val His Leu Asn Glu Ser Val Gln Ile Asn Cys Thr Arg
290 295 300

CCC AAC TAC AAT AAA AGA AAA AGG ATA CAT ATA GGA CCA GGG AGA GCA 7199
Pro Asn Tyr Asn Lys Arg Lys Arg Ile His Ile Gly Pro Gly Arg Ala
305 310 315 

TTT TAT ACA ACA AAA AAT ATA ATA GGA ACT ATA AGA CAA GCA CAT TGT 7247 
Phe Tyr Thr Thr Lys Asn He He Gly Thr He Arg Gin Ala His Cys 
320 325 330 335 

AAC ATT AGT AGA GCA AAA TGG AAT GAC ACT TTA AGA CAG ATA GTT AGC 7295
Asn Ile Ser Arg Ala Lys Trp Asn Asp Thr Leu Arg Gln Ile Val Ser
340 345 350 

AAA TTA AAA GAA CAA TTT AAG AAT AAA ACA ATA GTC TTT AAT CAA TCC 7343 
Lys Leu Lys Glu Gin Phe Lys Asn Lys Thr He Val Phe Asn Gin Ser 
355 360 365 

TCA GGA GGG GAC CCA GAA ATT GTA ATG CAC AGT TTT AAT TGT GGA GGG 7391 
Ser Gly Gly Asp Pro Glu He Val Met His Ser Phe Asn Cys Gly Gly 
370 375 380 

GAA TTT TTC TAC TGT AAT ACA TCA CCA CTG TTT AAT AGT ACT TGG AAT 7439 
Glu Phe Phe Tyr Cys Asn Thr Ser Pro Leu Phe Asn Ser Thr Trp Asn 
385 390 395 

GGT AAT AAT ACT TGG AAT AAT ACT ACA GGG TCA AAT AAC AAT ATC ACA 7487 
Gly Asn Asn Thr Trp Asn Asn Thr Thr Gly Ser Asn Asn Asn He Thr 
400 405 410 415 

CTT CAA TGC AAA ATA AAA CAA ATT ATA AAC ATG TGG CAG GAA GTA GGA 7535 
Leu Gin Cys Lys He Lys Gin He He Asn Met Trp Gin Glu Val Gly 
420 425 430 

AAA GCA ATA TAT GCC CCT CCC ATT GAA GGA CAA ATT AGA TGT TCA TCA 7583 
Lys Ala He Tyr Ala Pro Pro He Glu Gly Gin He Arg Cys Ser Ser 
435 440 445 

AAT ATT ACA GGG CTA CTA TTA ACA AGA GAT GGT GGT AAG GAC ACG GAC 7631 
Asn He Thr Gly Leu Leu Leu Thr Arg Asp Gly Gly Lys Asp Thr Asp 
450 455 460 

ACG AAC GAC ACC GAG ATC TTC AGA CCT GGA GGA GGA GAT ATG AGG GAC 7679 
Thr Asn Asp Thr Glu He Phe Arg Pro Gly Gly Gly Asp Met Arg Asp 
465 470 475

AAT TGG AGA AGT GAA TTA TAT AAA TAT AAA GTA GTA ACA ATT GAA CCA 7727
Asn Trp Arg Ser Glu Leu Tyr Lys Tyr Lys Val Val Thr Ile Glu Pro
480 485 490 495

TTA GGA GTA GCA CCC ACC AAG GCA AAG AGA AGA GTG GTG CAG AGA GAA 7775
Leu Gly Val Ala Pro Thr Lys Ala Lys Arg Arg Val Val Gln Arg Glu
500 505 510

AAA AGA GCA GCG ATA GGA GCT CTG TTC CTT GGG TTC TTA GGA GCA GCA 7823 
Lys Arg Ala Ala lie Gly Ala Leu Phe Leu Gly Phe Leu Gly Ala Ala 

515 520 525 

GGA AGC ACT ATG GGC GCA GCG TCA GTG ACG CTG ACG GTA CAG GCC AGA 7871
Gly Ser Thr Met Gly Ala Ala Ser Val Thr Leu Thr Val Gln Ala Arg
530 535 540 

CTA TTA TTG TCT GGT ATA GTG CAA CAG CAG AAC AAT TTG CTG AGG GCC 7919 
Leu Leu Leu Ser Gly lie Val Gin Gin Gin Asn Asn Leu Leu Arg Ala 
545 550 555 

ATT GAG GCG CAA CAG CAT ATG TTG CAA CTC ACA GTC TGG GGC ATC AAG 7967 
Ile Glu Ala Gln Gln His Met Leu Gln Leu Thr Val Trp Gly Ile Lys
560 565 570 575

CAG CTC CAG GCA AGA ATC CTG GCT GTG GAA AGA TAC CTA AAG GAT CAA 8015
Gln Leu Gln Ala Arg Ile Leu Ala Val Glu Arg Tyr Leu Lys Asp Gln
580 585 590 



CAG CTC CTG GGG ATT TGG GGT TGC TCT GGA AAA CTC ATT TGC ACC ACT 8063
Gln Leu Leu Gly Ile Trp Gly Cys Ser Gly Lys Leu Ile Cys Thr Thr
595 600 605

ACT GTG CCT TGG AAT GCT AGT TGG AGT AAT AAA TCT CTG GAT GAT ATT 8111
Thr Val Pro Trp Asn Ala Ser Trp Ser Asn Lys Ser Leu Asp Asp Ile
610 615 620

TGG AAT AAC ATG ACC TGG ATG CAG TGG GAA AGA GAA ATT GAC AAT TAC 8159
Trp Asn Asn Met Thr Trp Met Gln Trp Glu Arg Glu Ile Asp Asn Tyr
625 630 635

ACA AGC TTA ATA TAC TCA TTA CTA GAA AAA TCG CAA ACC CAA CAA GAA 8207
Thr Ser Leu Ile Tyr Ser Leu Leu Glu Lys Ser Gln Thr Gln Gln Glu
640 645 650 655 

ATG AAT GAA CAA GAA TTA TTG GAA TTG GAT AAA TGG GCA AGT TTG TGG 8255 
Met Asn Glu Gin Glu Leu Leu Glu Leu Asp Lys Trp Ala Ser Leu Trp 
660 665 670 

AAT TGG TTT GAC ATA ACA AAT TGG CTG TGG TAT ATA AAA ATA TTC ATA 8303 
Asn Trp Phe Asp lie Thr Asn Trp Leu Trp Tyr lie Lys lie Phe lie 
675 680 685 

ATG ATA GTA GGA GGC TTG GTA GGT TTA AGA ATA GTT TTT GCT GTA CTT 8351 
Met He Val Gly Gly Leu Val Gly Leu Arg He Val Phe Ala Val Leu 
690 695 700 

TCT ATA GTG AAT AGA GTT AGG CAG GGA TAC TCA CCA TTG TCG TTG CAG 8399 
Ser lie Val Asn Arg Val Arg Gin Gly Tyr Ser Pro Leu Ser Leu Gin 
705 710 715 



ACC CGC CCC CCA GTT CCG AGG GGA CCC GAC AGG CCC GAA GGA ATC GAA 8447 
Thr Arg Pro Pro Val Pro Arg Gly Pro Asp Arg Pro Glu Gly He Glu 
720 725 730 735

GAA GAA GGT GGA GAG AGA GAC AGA GAC ACA TCC GGT CGA TTA GTG CAT 8495
Glu Glu Gly Gly Glu Arg Asp Arg Asp Thr Ser Gly Arg Leu Val His
740 745 750

GGA TTC TTA GCA ATT ATC TGG GTC GAC CTG CGG AGC CTG TTC CTC TTC 8543
Gly Phe Leu Ala Ile Ile Trp Val Asp Leu Arg Ser Leu Phe Leu Phe
755 760 765

AGC TAC CAC CAC TTG AGA GAC TTA CTC TTG ATT GCA GCG AGG ATT GTG 8591
Ser Tyr His His Leu Arg Asp Leu Leu Leu Ile Ala Ala Arg Ile Val
770 775 780 

GAA CTT CTG GGA CGC AGG GGG TGG GAA GTC CTC AAA TAT TGG TGG AAT 8639 
Glu Leu Leu Gly Arg Arg Gly Trp Glu Val Leu Lys Tyr Trp Trp Asn 
785 790 795 

CTC CTA CAG TAT TGG AGT CAG GAA CTA AAG AGT AGT GCT GTT AGC TTG 8687 
Leu Leu Gin Tyr Trp Ser Gin Glu Leu Lys Ser Ser Ala Val Ser Leu 
800 805 810 815 

CTT AAT GCC ACA GAT ATA GCA GTA GCT GAG GGG ACA GAT AGG GTT ATA 8735
Leu Asn Ala Thr Asp Ile Ala Val Ala Glu Gly Thr Asp Arg Val Ile
820 825 830

GAA GTA CTG CAA AGA GCT GGT AGA GCT ATT CTC CAC ATA CCT ACA AGA 8783
Glu Val Leu Gln Arg Ala Gly Arg Ala Ile Leu His Ile Pro Thr Arg
835 840 845

ATA AGA CAG GGC TTG GAA AGG GCT TTG CTA TAAGATGGGT GGCAAATGGT 8833
Ile Arg Gln Gly Leu Glu Arg Ala Leu Leu
850 855 

CAAAACGTGT GACTGGATGG CCTACTGTAA GGGAAAAAAT GAGACGAGCT GAACCAGCTG 8893

AGOCAGCAGC AGATGGGGTG GGAGCAGCAT CCCGAGACCT GGAAAAACAT GGAGCACTCA 8953

CAAGTAGCAA TACAGCAGCT ACCAATGCTG ATTGTGCCTG GCTAGAAGCA CAAGAGGAGG 9013

AGGAAGTGGG TTTTCCAGTC AGACCTCAGG TACCTTTAAG ACCAATGACT TACAAAGCAG 9073

CTTTAGATCT TAGCCACTTT TTAAAAGAAA AGGGGGGACT GGATGGGTTA ATTTACTCCC 9133 

AAAAGAGACA AGACATCCTT GATCTGTGGG TCTACCACAC ACAAGGCTAC TTCCCTGATT 9193 

GGCAGAACTA CACACCAGGG CCAGGGATCA GATATCCACT GACCTTTGGA TGGTGCTTCA 9253 

AGCTAGTACC AGTTGAGCCA GAGAAGATAG AAGAGGCCAA TAAAGGAGAG AACAACTGCT 9313 

TGTTACACCC TATGAGCCAG CATGGGATGG ATGACCCGGA GAGAGAAGTG TTAGTGTGGA 9373

AGTCTGACAG CCACCTAGCA TTTCAGCATT ATGCCCGAGA GCTGCATCCG GAGTACTACA 9433

AGAACTGCTG ACATCGAGCT ATCTACAAGG GACTTTCCGC TGGGGACTTT CCAGGGAGGT 9493

GTGGCCTGGG CGGGACCGGG GAGTGGCGAG CCCTCAGATG CTGCATATAA GCAGCTGCTT 9553 

TCTGCCTGTA CTGGGTCTCT CTGGTTAGAC CAGATCTGAG CCTGGGAGCT CTCTGGCTAA 9613 

CTAGGGAACC CACTGCTTAA GCCTCAATAA AGCTTGCCTT GAGTGCTTCA AGTAGTGTGT 9673

GCCCGTCTGT TATGTGACTC TGGTAGCTAG AGATCCCTCA GATCCTTTTA GGCAGTGTGG 9733 

AAAATCTCTA GCA 9746 



Met Arg Val Lys Gly Ile Arg Arg Asn Tyr Gln His Trp Trp Gly Trp
1 5 10 15

Gly Thr Met Leu Leu Gly Leu Leu Met Ile Cys Ser Ala Thr Glu Lys
20 25 30 

Leu Trp Val Thr Val Tyr Tyr Gly Val Pro Val Trp Lys Glu Ala Thr 
35 40 45 

Thr Thr Leu Phe Cys Ala Ser Asp Ala Lys Ala Tyr Asp Thr Glu Val 
50 55 60 

His Asn Val Trp Ala Thr His Ala Cys Val Pro Thr Asp Pro Asn Pro 
65 70 75 80 

Gln Glu Val Glu Leu Val Asn Val Thr Glu Asn Phe Asn Met Trp Lys
85 90 95

Asn Asn Met Val Glu Gln Met His Glu Asp Ile Ile Ser Leu Trp Asp
100 105 110

Gln Ser Leu Lys Pro Cys Val Lys Leu Thr Pro Leu Cys Val Thr Leu
115 120 125 

Asn Cys Thr Asp Leu Arg Asn Thr Thr Asn Thr Asn Asn Ser Thr Ala 
130 135 140 

Asn Asn Asn Ser Asn Ser Glu Gly Thr He Lys Gly Gly Glu Met Lys 
145 150 155 160 

Asn Cys Ser Phe Asn He Thr Thr Ser He Arg Asp Lys Met Gin Lys 
165 170 175 

Glu Tyr Ala Leu Leu Tyr Lys Leu Asp He Val Ser He Asn Asn Asp 
180 185 190 

Ser Thr Ser Tyr Arg Leu Ile Ser Cys Asn Thr Ser Val Ile Thr Gln
195 200 205

Ala Cys Pro Lys Ile Ser Phe Glu Pro Ile Pro Ile His Tyr Cys Ala
210 215 220

Pro Ala Gly Phe Ala Ile Leu Lys Cys Asn Asp Lys Lys Phe Ser Gly
225 230 235 240

Lys Gly Ser Cys Lys Asn Val Ser Thr Val Gln Cys Thr His Gly Ile
245 250 255

Arg Pro Val Val Ser Thr Gln Leu Leu Leu Asn Gly Ser Leu Ala Glu
260 265 270

Glu Glu Val Val Ile Arg Ser Glu Asn Phe Asn Asp Asn Ala Lys Thr
275 280 285

Ile Ile Val His Leu Asn Glu Ser Val Gln Ile Asn Cys Thr Arg Pro
290 295 300 

Asn Tyr Asn Lys Arg Lys Arg He His He Gly Pro Gly Arg Ala Phe 

305 310 315 320 

Tyr Thr Thr Lys Asn He He Gly Thr He Arg Gin Ala His Cys Asn 
325 330 335 

He Ser Arg Ala Lys Trp Asn Asp Thr Leu Arg Gin He Val Ser Lys 
340 345 350 

Leu Lys Glu Gin Phe Lys Asn Lys Thr He Val Phe Asn Gin Ser Ser 
355 360 365

Gly Gly Asp Pro Glu Ile Val Met His Ser Phe Asn Cys Gly Gly Glu
370 375 380 

Phe Phe Tyr Cys Asn Thr Ser Pro Leu Phe Asn Ser Thr Trp Asn Gly 
385 390 395 400 

Asn Asn Thr Trp Asn Asn Thr Thr Gly Ser Asn Asn Asn He Thr Leu 
405 410 415 

Gin Cys Lys He Lys Gin He He Asn Met Trp Gin Glu Val Gly Lys 
420 425 430 

Ala He Tyr Ala Pro Pro He Glu Gly Gin He Arg Cys Ser Ser Asn 
435 440 445 

He Thr Gly Leu Leu Leu Thr Arg Asp Gly Gly Lys Asp Thr Asp Thr 
450 455 460 

Asn Asp Thr Glu He Phe Arg Pro Gly Gly Gly Asp Met Arg Asp Asn 
465 470 475 480 

Trp Arg Ser Glu Leu Tyr Lys Tyr Lys Val Val Thr Ile Glu Pro Leu
485 490 495

Gly Val Ala Pro Thr Lys Ala Lys Arg Arg Val Val Gln Arg Glu Lys
500 505 510

Arg Ala Ala Ile Gly Ala Leu Phe Leu Gly Phe Leu Gly Ala Ala Gly
515 520 525

Ser Thr Met Gly Ala Ala Ser Val Thr Leu Thr Val Gln Ala Arg Leu
530 535 540

Leu Leu Ser Gly Ile Val Gln Gln Gln Asn Asn Leu Leu Arg Ala Ile

545 550 555 560 

Glu Ala Gin Gin His Met Leu Gin Leu Thr Val Trp Gly He Lys Gin 
565 570 575 

Leu Gin Ala Arg He Leu Ala Val Glu Arg Tyr Leu Lys Asp Gin Gin 
580 585 590 

Leu Leu Gly He Trp Gly Cys Ser Gly Lys Leu He Cys Thr Thr Thr 
595 600 605 

Val Pro Trp Asn Ala Ser Trp Ser Asn Lys Ser Leu Asp Asp He Trp 

610 615 620 

Asn Asn Met Thr Trp Met Gin Trp Glu Arg Glu He Asp Asn Tyr Thr 
625 630 635 640 

Ser Leu He Tyr Ser Leu Leu Glu Lys Ser Gin Thr Gin Gin Glu Met 
645 650 655 

Asn Glu Gin Glu Leu Leu Glu Leu Asp Lys Trp Ala Ser Leu Trp Asn 
660 665 670 

Trp Phe Asp Ile Thr Asn Trp Leu Trp Tyr Ile Lys Ile Phe Ile Met
675 680 685

Ile Val Gly Gly Leu Val Gly Leu Arg Ile Val Phe Ala Val Leu Ser
690 695 700

Ile Val Asn Arg Val Arg Gln Gly Tyr Ser Pro Leu Ser Leu Gln Thr
705 710 715 720 
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Arg 



Pro 



Pro 



Val 



Pro 

725 



Arg 



Gly 



Pro 



Asp Arg 
730 



Pro Glu Gly He Glu 
735 



Glu 



Glu Gly Gly Glu Arg Asp Arg Asp Thr Ser Gly Arg Leu Val His Gly
740 745 750

Phe Leu Ala Ile Ile Trp Val Asp Leu Arg Ser Leu Phe Leu Phe Ser
755 760 765

Tyr His His Leu Arg Asp Leu Leu Leu Ile Ala Ala Arg Ile Val Glu
770 775 780

Leu Leu Gly Arg Arg Gly Trp Glu Val Leu Lys Tyr Trp Trp Asn Leu 
785 790 795 800 

Leu Gln Tyr Trp Ser Gln Glu Leu Lys Ser Ser Ala Val Ser Leu Leu
805 810 815

Asn Ala Thr Asp Ile Ala Val Ala Glu Gly Thr Asp Arg Val Ile Glu
820 825 830

Val Leu Gln Arg Ala Gly Arg Ala Ile Leu His Ile Pro Thr Arg Ile
835 840 845

Arg Gln Gly Leu Glu Arg Ala Leu Leu
850 855

TABLE III

GATCAAGGGC CACAGAGGGll GCXSIGAGAAT GAATGGACAC TAGAGCTTTT AGAGGAGCTT 60

AAGAGTGAAG CTGTTAGACA CTTTCCTAGG ATATGGCTTC ATGGCTTAGG GCAACATATC 120

TATGAAACTT ATGGGGATAC TTGGGCAGGA GTGGAAGCCA TAATAAGAAT TCTGCAACAA 180

CTGCTGTTTA TCGATTTCAG GATTGGGTGC CAACATAGCA GAATAGGTAT TATTCAACAG 240

AGGAGAGGAA GAAATGGAGC CAGTAGATCC TAAACTAGAG CCCTGGAAGC ATCCAGGAAG 300

TCAGCCTAAG ACTGCTTGTA CCACTTGCTA TTGTAAAAAG TGTTGCTTTC ATTGCCAAGT 360

TTGCTTCATA ACAAAAGGCT TAGGCATCTC CTATGGCAGG AAGAAGCGGA GACAGCGACG 420

AAGAGCTTCT CAAGACAGTG AGACTCATCA AGTTTCTCTA TCAAAGCAGT AAGTAGTACA 480

TGTAATGCAA GCTTTACAAA TATCAGCTAT AGTAGGATTA GTAGTAGCAG CAATAATAGC 540

AATAGTTGTG TGGACCATAG TATTCATAGA ATATAGGAAA ATATTAAGGC AAAGAAAAAT 600

AGACAGGTTA ATTGATAGAA TAACAGAAAG AGCAGAAGAC AGTGGCA ATG AGA GTG 656

Met Arg Val 
1 

ACG GAG ATC AGG AAG AGT TAT CAG CAC TGG TGG AGA TGG GGC ATC ATG 704
Thr Glu Ile Arg Lys Ser Tyr Gln His Trp Trp Arg Trp Gly Ile Met
5 10 15

CTC CTT GGG ATA TTA ATG ATC TGT AAT GCT GAA GAA AAA TTG TGG GTC 752
Leu Leu Gly Ile Leu Met Ile Cys Asn Ala Glu Glu Lys Leu Trp Val
20 25 30 35

ACA GTC TAT TAT GGG GTA CCT GTG TGG AAA GAA GCA ACC ACC ACT CTA 800 
Thr Val Tyr Tyr Gly Val Pro Val Trp Lys Glu Ala Thr Thr Thr Leu 
40 45 50 



TTT TGT GCA TCA GAT CGT AAA GCA TAT GAT ACA GAG GTA CAT AAT GTT 848 
Phe Cys Ala Ser Asp Arg Lys Ala Tyr Asp Thr Glu Val His Asn Val 
55 60 65 

TGG GCC ACA CAT GCC TGT GTA CCC ACA GAC CCC AAC CCA CAA GAA GTA 896 
Trp Ala Thr His Ala Cys Val Pro Thr Asp Pro Asn Pro Gln Glu Val
70 75 80 

GAA TTG AAA AAT GTG ACA GAA AAT TTT AAC ATG TGG AAA AAT AAC ATG 944 
Glu Leu Lys Asn Val Thr Glu Asn Phe Asn Met Trp Lys Asn Asn Met 
85 90 95 

GTA GAA CAA ATG CAT GAG GAT ATA ATC AGT TTA TGG GAT CAA AGC CTA 992 
Val Glu Gln Met His Glu Asp Ile Ile Ser Leu Trp Asp Gln Ser Leu
100 105 110 115 

AAG CCA TGT GTA AAA TTA ACC CCA CTC TGT GTT ACT TTA AAT TGC ACT 1040 
Lys Pro Cys Val Lys Leu Thr Pro Leu Cys Val Thr Leu Asn Cys Thr 
120 125 130 

GAT TTG AGG AAT GCT ACT AAT GGG AAT GAC ACT AAT ACC ACT AGT AGT 1088 
Asp Leu Arg Asn Ala Thr Asn Gly Asn Asp Thr Asn Thr Thr Ser Ser 
135 140 145

AGC AGG GGA ATG GTG GGG GGA GGA GAA ATG AAA AAT TGC TCT TTC AAT 1136
Ser Arg Gly Met Val Gly Gly Gly Glu Met Lys Asn Cys Ser Phe Asn
150 155 160

ATC ACC ACA AAC ATA AGA GGT AAG GTG CAG AAA GAA TAT GCA CTT TTT 1184
Ile Thr Thr Asn Ile Arg Gly Lys Val Gln Lys Glu Tyr Ala Leu Phe
165 170 175

TAT AAA CTT GAT ATA GCA CCA ATA GAT AAT AAT AGT AAT AAT AGA TAT 1232
Tyr Lys Leu Asp Ile Ala Pro Ile Asp Asn Asn Ser Asn Asn Arg Tyr
180 185 190 195

AGG TTG ATA AGT TGT AAC ACC TCA GTC ATT ACA CAG GCC TGT CCA AAG 1280
Arg Leu Ile Ser Cys Asn Thr Ser Val Ile Thr Gln Ala Cys Pro Lys
200 205 210

GTA TCC TTT GAG CCA ATT CCC ATA CAT TAT TGT GCC CCG GCT GGT TTT 1328
Val Ser Phe Glu Pro Ile Pro Ile His Tyr Cys Ala Pro Ala Gly Phe
215 220 225

GCG ATT CTA AAG TGT AAA GAT AAG AAG TTC AAT GGA AAA GGA CCA TGT 1376
Ala Ile Leu Lys Cys Lys Asp Lys Lys Phe Asn Gly Lys Gly Pro Cys
230 235 240

ACA AAT GTC AGC ACA GTA CAA TGT ACA CAT GGA ATT AGG CCA GTA GTA 1424
Thr Asn Val Ser Thr Val Gln Cys Thr His Gly Ile Arg Pro Val Val
245 250 255

TCA ACT CAA CTG CTG TTA AAT GGC AGT CTA GCA GAA GAA GAG GTA GTA 1472
Ser Thr Gln Leu Leu Leu Asn Gly Ser Leu Ala Glu Glu Glu Val Val
260 265 270 275

ATT AGA TCC GCC AAT TTC GCG GAC AAT GCT AAA GTC ATA ATA GTA CAG 1520
Ile Arg Ser Ala Asn Phe Ala Asp Asn Ala Lys Val Ile Ile Val Gln
280 285 290

CTG AAT GAA TCT GTA GAA ATT AAT TGT ACA AGA CCC AAC AAC AAT ACA 1568
Leu Asn Glu Ser Val Glu Ile Asn Cys Thr Arg Pro Asn Asn Asn Thr
295 300 305

AGA AAA AGT ATA CAT ATA GGA CCA GGC AGA GCA TTT TAT ACA ACA GGA 1616
Arg Lys Ser Ile His Ile Gly Pro Gly Arg Ala Phe Tyr Thr Thr Gly
310 315 320

GAA ATA ATA GGA GAT ATA AGA CAA GCA CAT TGT AAC CTT AGT AGA GCA 1664
Glu Ile Ile Gly Asp Ile Arg Gln Ala His Cys Asn Leu Ser Arg Ala
325 330 335

AAA TGG AAT GAC ACT TTA AAT AAG ATA GTT ATA AAA TTA AGA GAA CAA 1712
Lys Trp Asn Asp Thr Leu Asn Lys Ile Val Ile Lys Leu Arg Glu Gln
340 345 350 355

TTT GGG AAT AAA ACA ATA GTC TTT AAG CAC TCC TCA GGA GGG GAC CCA 1760
Phe Gly Asn Lys Thr Ile Val Phe Lys His Ser Ser Gly Gly Asp Pro
360 365 370

GAA ATT GTG ACG CAC AGT TTT AAT TGT GGA GGG GAA TTT TTC TAC TGT 1808
Glu Ile Val Thr His Ser Phe Asn Cys Gly Gly Glu Phe Phe Tyr Cys
375 380 385

AAT TCA ACA CAA CTG TTT AAT AGT ACT TGG AAT GTT ACT GAA GAG TCA 1856
Asn Ser Thr Gln Leu Phe Asn Ser Thr Trp Asn Val Thr Glu Glu Ser
390 395 400

AAT AAC ACT GTA GAA AAT AAC ACA ATC ACA CTC CCA TGC AGA ATA AAA 1904
Asn Asn Thr Val Glu Asn Asn Thr Ile Thr Leu Pro Cys Arg Ile Lys
405 410 415

CAA ATT ATA AAC ATG TGG CAG GAA GTA GGA AGA GCA ATG TAT GCC CCT 1952
Gln Ile Ile Asn Met Trp Gln Glu Val Gly Arg Ala Met Tyr Ala Pro
420 425 430 435

CCC ATC AGA GGA CAA ATT AGA TGT TCA TCA AAT ATT ACA GGG CTG CTA 2000
Pro Ile Arg Gly Gln Ile Arg Cys Ser Ser Asn Ile Thr Gly Leu Leu
440 445 450

TTA ACA AGA GAT GGT GGT CCT GAG GAC AAC AAG ACC GAG GTC TTC AGA 2048
Leu Thr Arg Asp Gly Gly Pro Glu Asp Asn Lys Thr Glu Val Phe Arg
455 460 465

CCT GGA GGA GGA GAT ATG AGG GAT AAT TGG AGA AGT GAA TTA TAT AAA 2096
Pro Gly Gly Gly Asp Met Arg Asp Asn Trp Arg Ser Glu Leu Tyr Lys
470 475 480

TAT AAA GTA GTA AAA ATT GAA CCA TTA GGA GTA GCA CCC ACC AAG GCA 2144
Tyr Lys Val Val Lys Ile Glu Pro Leu Gly Val Ala Pro Thr Lys Ala
485 490 495

AAG AGA AGA GTG GTG CAG AGA GAA AAA AGA GCA GTG GGA ATA GGA GCT 2192
Lys Arg Arg Val Val Gln Arg Glu Lys Arg Ala Val Gly Ile Gly Ala
500 505 510 515

GTG TTC CTT GGG TTC TTG GGA GCA GCA GGA AGC ACT ATG GGC GCA GCG 2240
Val Phe Leu Gly Phe Leu Gly Ala Ala Gly Ser Thr Met Gly Ala Ala
520 525 530

GCA ATG ACG CTG ACG GTA CAG GCC AGA CTA TTA TTG TCT GGT ATA GTG 2288
Ala Met Thr Leu Thr Val Gln Ala Arg Leu Leu Leu Ser Gly Ile Val
535 540 545

CAA CAG CAG AAC AAT CTG CTG AGG GCT ATT GAG GCG CAA CAG CAT CTG 2336
Gln Gln Gln Asn Asn Leu Leu Arg Ala Ile Glu Ala Gln Gln His Leu
550 555 560

TTG CAA CTC ACA GTC TGG GGC ATC AAG CAG CTC CAG GCA AGA GTC CTG 2384
Leu Gln Leu Thr Val Trp Gly Ile Lys Gln Leu Gln Ala Arg Val Leu
565 570 575

GCT GTG GAA AGA TAC CTA AGG GAT CAA CAG CTC CTG GGG ATT TGG GGT 2432
Ala Val Glu Arg Tyr Leu Arg Asp Gln Gln Leu Leu Gly Ile Trp Gly
580 585 590 595

TGC TCT GGA AAA CTC ATC TGC ACC ACT GCT GTG CCT TGG AAT GCT AGT 2480
Cys Ser Gly Lys Leu Ile Cys Thr Thr Ala Val Pro Trp Asn Ala Ser
600 605 610

TGG AGT AAT AAA TCT CTG AAT AAG ATT TGG GAT AAC ATG ACC TGG ATA 2528
Trp Ser Asn Lys Ser Leu Asn Lys Ile Trp Asp Asn Met Thr Trp Ile
615 620 625

GAG TGG GAC AGA GAA ATT AAC AAT TAC ACA AGC ATA ATA TAC AGC TTA 2576
Glu Trp Asp Arg Glu Ile Asn Asn Tyr Thr Ser Ile Ile Tyr Ser Leu
630 635 640

ATT GAA GAA TCG CAG AAC CAA CAA GAA AAG AAT GAA CAA GAA TTA TTA 2624
Ile Glu Glu Ser Gln Asn Gln Gln Glu Lys Asn Glu Gln Glu Leu Leu
645 650 655

GAA TTA GAT AAA TGG GCA AGT TTG TGG AAT TGG TTT GAC ATA ACA AAA 2672
Glu Leu Asp Lys Trp Ala Ser Leu Trp Asn Trp Phe Asp Ile Thr Lys
660 665 670 675

TGG CTG TGG TAT ATA AAA ATA TTC ATA ATG ATA GTA GGA GGC TTG ATA 2720
Trp Leu Trp Tyr Ile Lys Ile Phe Ile Met Ile Val Gly Gly Leu Ile
680 685 690

GGT TTA AGA ATA GTT TTT TCT GTA CTT TCT ATA GTG AAT AGA GTT AGG 2768
Gly Leu Arg Ile Val Phe Ser Val Leu Ser Ile Val Asn Arg Val Arg
695 700 705

CAG GGA TAC TCA CCA TTA TCG TTT CAG ACC CAC CTC CCA TCC TCG AGG 2816
Gln Gly Tyr Ser Pro Leu Ser Phe Gln Thr His Leu Pro Ser Ser Arg
710 715 720

GGA CCC GAC AGG CCC GGA GGA ATC GAA GAA GAA GGT GGA GAG AGA GAC 2864
Gly Pro Asp Arg Pro Gly Gly Ile Glu Glu Glu Gly Gly Glu Arg Asp
725 730 735

AGA GAC AGA TCC GGT CCA TTA GTG AAC GGA TTC TTG GCG CTT ATC TGG 2912
Arg Asp Arg Ser Gly Pro Leu Val Asn Gly Phe Leu Ala Leu Ile Trp
740 745 750 755

GTC GAT CTG CGG AGC CTG TTC CTC TTC AGC TAC CAC CGC TTG AGA GAC 2960
Val Asp Leu Arg Ser Leu Phe Leu Phe Ser Tyr His Arg Leu Arg Asp
760 765 770



TTA CTC TTG ATT GTG ATG AGG ATT GTG GAA CTT CTG GGA CTA GCA GGG 3008
Leu Leu Leu Ile Val Met Arg Ile Val Glu Leu Leu Gly Leu Ala Gly
775 780 785

GGG TGG GAA GTC CTC AAA TAT TGG TGG AAT CTC CTA CAG TAT TGG AGT 3056
Gly Trp Glu Val Leu Lys Tyr Trp Trp Asn Leu Leu Gln Tyr Trp Ser
790 795 800

CAG GAA CTA AAG AAT AGT GCT GTT AGC TTG CTC AAT GCC ACA GCT GTA 3104
Gln Glu Leu Lys Asn Ser Ala Val Ser Leu Leu Asn Ala Thr Ala Val
805 810 815

GCA GTA GCT GAA GGG ACA GAT AGG GTT ATA GAA GTA TTA CAG AGA GCT 3152
Ala Val Ala Glu Gly Thr Asp Arg Val Ile Glu Val Leu Gln Arg Ala
820 825 830 835

GTT AGA GCT ATT CTC CAC ATA CCT AGA AGA ATA AGA CAG GGC TTG GAA 3200
Val Arg Ala Ile Leu His Ile Pro Arg Arg Ile Arg Gln Gly Leu Glu
840 845 850

AGG GCT TTG CTA TAAGATGGGT GGGAAGTGGT CAAAAAGTAG TATAGTCGTA 3252 
Arg Ala Leu Leu 
855 

TGGCCTGCTG TAAGGAAAAG AATGAGAAGA ACTGAGCCAG CAGCAGATGG AGTAGGAGCA 3312
GTATCTAGAG ACCTGGAAAA ACATGGAGCA ATCACAAGTA GCAATACAGC AGCTAACAAT 3372
GCTGATTGTG CCTGGCTAGA AGCACAAGAG GATGAAGAAG TGGGTTTTCC AGTCAGACCT 3432
CAGGTACCTT TAAGACCAAT GACTCGCAGT GCAGCTATAG ATCTTAGCCA CTTTTTTAAG 3492
AAAAAGGGGG GACTGGAAGG GCTAATTCAC TCCCAAAAAA GACAAGATAT CCTTGATTTG 3552
TGGGTCTACC ACACACAAGG CTACTTCCCT GATTGGGAGA ACTACACACC AGGGCCAGGG 3612
ACCAGATTTC CACTGACCTT TGGATGGTGC TTCAAGCTAG TACCAGTTGA GCCAGAGAAG 3672
GTAGAAGAGG CCAATGAAGG AGAGAACAAC TGCTTGTCAC ACCCTATGAG CCTGCATGGG 3732
ATGGATGACC CGGAGAAAGA AGTGTTAGCA TGGAAGTTTG ACAGCAGCCT AGCATTCCAT 3792
CACGTGGCCC GAGAA 3807

Met Arg Val Thr Glu Ile Arg Lys Ser Tyr Gln His Trp Trp Arg Trp
1 5 10 15

Gly Ile Met Leu Leu Gly Ile Leu Met Ile Cys Asn Ala Glu Glu Lys
20 25 30

Leu Trp Val Thr Val Tyr Tyr Gly Val Pro Val Trp Lys Glu Ala Thr
35 40 45

Thr Thr Leu Phe Cys Ala Ser Asp Arg Lys Ala Tyr Asp Thr Glu Val
50 55 60

His Asn Val Trp Ala Thr His Ala Cys Val Pro Thr Asp Pro Asn Pro
65 70 75 80

Gln Glu Val Glu Leu Lys Asn Val Thr Glu Asn Phe Asn Met Trp Lys
85 90 95

Asn Asn Met Val Glu Gln Met His Glu Asp Ile Ile Ser Leu Trp Asp
100 105 110

Gln Ser Leu Lys Pro Cys Val Lys Leu Thr Pro Leu Cys Val Thr Leu
115 120 125

Asn Cys Thr Asp Leu Arg Asn Ala Thr Asn Gly Asn Asp Thr Asn Thr
130 135 140

Thr Ser Ser Ser Arg Gly Met Val Gly Gly Gly Glu Met Lys Asn Cys
145 150 155 160

Ser Phe Asn Ile Thr Thr Asn Ile Arg Gly Lys Val Gln Lys Glu Tyr
165 170 175

Ala Leu Phe Tyr Lys Leu Asp Ile Ala Pro Ile Asp Asn Asn Ser Asn
180 185 190

Asn Arg Tyr Arg Leu Ile Ser Cys Asn Thr Ser Val Ile Thr Gln Ala
195 200 205

Cys Pro Lys Val Ser Phe Glu Pro Ile Pro Ile His Tyr Cys Ala Pro
210 215 220

Ala Gly Phe Ala Ile Leu Lys Cys Lys Asp Lys Lys Phe Asn Gly Lys
225 230 235 240

Gly Pro Cys Thr Asn Val Ser Thr Val Gln Cys Thr His Gly Ile Arg
245 250 255

Pro Val Val Ser Thr Gln Leu Leu Leu Asn Gly Ser Leu Ala Glu Glu
260 265 270

Glu Val Val Ile Arg Ser Ala Asn Phe Ala Asp Asn Ala Lys Val Ile
275 280 285

Ile Val Gln Leu Asn Glu Ser Val Glu Ile Asn Cys Thr Arg Pro Asn
290 295 300

Asn Asn Thr Arg Lys Ser Ile His Ile Gly Pro Gly Arg Ala Phe Tyr
305 310 315 320

Thr Thr Gly Glu Ile Ile Gly Asp Ile Arg Gln Ala His Cys Asn Leu
325 330 335

Ser Arg Ala Lys Trp Asn Asp Thr Leu Asn Lys Ile Val Ile Lys Leu
340 345 350

Arg Glu Gln Phe Gly Asn Lys Thr Ile Val Phe Lys His Ser Ser Gly
355 360 365

Gly Asp Pro Glu Ile Val Thr His Ser Phe Asn Cys Gly Gly Glu Phe
370 375 380

Phe Tyr Cys Asn Ser Thr Gln Leu Phe Asn Ser Thr Trp Asn Val Thr
385 390 395 400

Glu Glu Ser Asn Asn Thr Val Glu Asn Asn Thr Ile Thr Leu Pro Cys
405 410 415

Arg Ile Lys Gln Ile Ile Asn Met Trp Gln Glu Val Gly Arg Ala Met
420 425 430

Tyr Ala Pro Pro Ile Arg Gly Gln Ile Arg Cys Ser Ser Asn Ile Thr
435 440 445

Gly Leu Leu Leu Thr Arg Asp Gly Gly Pro Glu Asp Asn Lys Thr Glu
450 455 460

Val Phe Arg Pro Gly Gly Gly Asp Met Arg Asp Asn Trp Arg Ser Glu
465 470 475 480

Leu Tyr Lys Tyr Lys Val Val Lys Ile Glu Pro Leu Gly Val Ala Pro
485 490 495

Thr Lys Ala Lys Arg Arg Val Val Gln Arg Glu Lys Arg Ala Val Gly
500 505 510

Ile Gly Ala Val Phe Leu Gly Phe Leu Gly Ala Ala Gly Ser Thr Met
515 520 525

Gly Ala Ala Ala Met Thr Leu Thr Val Gln Ala Arg Leu Leu Leu Ser
530 535 540

Gly Ile Val Gln Gln Gln Asn Asn Leu Leu Arg Ala Ile Glu Ala Gln
545 550 555 560

Gln His Leu Leu Gln Leu Thr Val Trp Gly Ile Lys Gln Leu Gln Ala
565 570 575

Arg Val Leu Ala Val Glu Arg Tyr Leu Arg Asp Gln Gln Leu Leu Gly
580 585 590

Ile Trp Gly Cys Ser Gly Lys Leu Ile Cys Thr Thr Ala Val Pro Trp
595 600 605

Asn Ala Ser Trp Ser Asn Lys Ser Leu Asn Lys Ile Trp Asp Asn Met
610 615 620

Thr Trp Ile Glu Trp Asp Arg Glu Ile Asn Asn Tyr Thr Ser Ile Ile
625 630 635 640

Tyr Ser Leu Ile Glu Glu Ser Gln Asn Gln Gln Glu Lys Asn Glu Gln
645 650 655

Glu Leu Leu Glu Leu Asp Lys Trp Ala Ser Leu Trp Asn Trp Phe Asp
660 665 670

Ile Thr Lys Trp Leu Trp Tyr Ile Lys Ile Phe Ile Met Ile Val Gly
675 680 685

Gly Leu Ile Gly Leu Arg Ile Val Phe Ser Val Leu Ser Ile Val Asn
690 695 700

Arg Val Arg Gln Gly Tyr Ser Pro Leu Ser Phe Gln Thr His Leu Pro
705 710 715 720

Ser Ser Arg Gly Pro Asp Arg Pro Gly Gly Ile Glu Glu Glu Gly Gly
725 730 735

Glu Arg Asp Arg Asp Arg Ser Gly Pro Leu Val Asn Gly Phe Leu Ala
740 745 750

Leu Ile Trp Val Asp Leu Arg Ser Leu Phe Leu Phe Ser Tyr His Arg
755 760 765

Leu Arg Asp Leu Leu Leu Ile Val Met Arg Ile Val Glu Leu Leu Gly
770 775 780

Leu Ala Gly Gly Trp Glu Val Leu Lys Tyr Trp Trp Asn Leu Leu Gln
785 790 795 800

Tyr Trp Ser Gln Glu Leu Lys Asn Ser Ala Val Ser Leu Leu Asn Ala
805 810 815

Thr Ala Val Ala Val Ala Glu Gly Thr Asp Arg Val Ile Glu Val Leu
820 825 830

Gln Arg Ala Val Arg Ala Ile Leu His Ile Pro Arg Arg Ile Arg Gln
835 840 845

Gly Leu Glu Arg Ala Leu Leu 
850 855 
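
The pairing of nucleotide triplets with the three-letter residues printed beneath them in Table III can be checked mechanically. The following is a minimal Python sketch, assuming the standard genetic code; the names `CODON_TABLE`, `THREE_LETTER` and `translate` are illustrative only and are not part of the original disclosure. It translates the first ten codons of the env open reading frame as printed in Table III.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in the conventional TCAG ordering
# (TTT, TTC, TTA, TTG, TCT, ...); "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter residue names, as used in the tables.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "End",
}


def translate(dna: str) -> list[str]:
    """Translate a DNA string (whitespace ignored) codon by codon."""
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3  # drop any trailing partial codon
    return [THREE_LETTER[CODON_TABLE[seq[i:i + 3]]] for i in range(0, usable, 3)]


# First ten codons of the env reading frame as printed in Table III.
first_codons = "ATG AGA GTG ACG GAG ATC AGG AAG AGT TAT"
print(" ".join(translate(first_codons)))
```

Running the same function over any codon row of the table and comparing the result with the residue row beneath it is one way to spot transcription errors in either row.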
WHAT IS CLAIMED IS:

1. A substantially pure preparation of a molecular
clone capable of yielding after transfection into
recipient cells active cultures of the Human Immunodeficiency
Virus Type 1 (HIV-1) virus strain MN-ST1, having
the identifying characteristics of ATCC 40889.

2. A substantially pure preparation of DNA
containing the envelope and rev coding sequences of the
(HIV-1) virus strain BA-L, having the identifying
characteristics of ATCC 40890.

3. A DNA segment encoding an envelope (env)
protein of MN-ST1.

4. The DNA segment according to claim 3 having 
the sequence given in Table III. 

5. A DNA segment encoding an env protein of BA-L.

6. A DNA segment according to claim 5 having the 
sequence given in Table III. 

7. A purified MN-ST1 env protein.

8. The protein according to claim 7 having the 
sequence given in Table II. 

9. A purified BA-L protein. 

10. The protein according to claim 9 having the 
sequence given in Table III. 

11. A DNA construct comprising: 

i) the DNA segment according to claim 3; 
and 

ii) a vector. 

12. The DNA construct according to claim 11 
further comprising a DNA segment encoding a rev protein 
and a rev-responsive region. 

13. A DNA construct comprising: 

i) the DNA segment according to claim 5; 
and 

ii) a vector. 

14. The DNA construct according to claim 13
further comprising a DNA segment encoding a rev protein
and a rev-responsive region.

15. A recombinantly produced MN-ST1 env protein.

16. A recombinantly produced BA-L env protein.

17. A host cell stably transformed with said 
recombinant DNA construct according to claim 11 or claim 
13, in a manner allowing expression of said viral protein 
encoded in said recombinant DNA molecule. 

18. A method of producing a recombinant HIV-1
virus strain MN-ST1 protein comprising culturing said host
cells according to claim 17, in a manner allowing expression
of said viral protein and isolating said viral
protein.

19. A vaccine for mammals against HIV-1 infection
comprising a non-infectious antigenic portion of said
MN-ST1 virus strain according to claim 1, in an amount
sufficient to induce immunization against said infection,
and a pharmaceutically acceptable carrier.

20. A vaccine for mammals against HIV-1 infection
comprising a non-infectious antigenic portion of said BA-L
virus strain according to claim 2 in an amount sufficient
to induce immunization against said infection, and a
pharmaceutically acceptable carrier.

21. The vaccine according to claim 19 or claim 20 
which further comprises an adjuvant. 

22. A vaccine for mammals against HIV-1 infection
comprising at least 5 amino acids of a MN-ST1 virus strain
env protein, in an amount sufficient to induce immunization
against said infection, and a pharmaceutically
acceptable carrier.

23. A vaccine for mammals against HIV-1 infection
comprising at least 5 amino acids of a BA-L virus strain
env protein, in an amount sufficient to induce immunization
against said infection, and a pharmaceutically
acceptable carrier.

24. The vaccine according to claim 22 or 23 
wherein said protein is a recombinantly produced protein. 

25. A method of testing candidate vaccines
against HIV-1 infection comprising administering said
vaccine and the MN-ST1 virus strain according to claim 1,
to a test mammal and detecting the presence or absence of
said infection.

26. A method of screening drugs for their ability 
to effect HIV-1 activity comprising contacting host cells 
according to claim 17, with said drug under conditions 
such that said activity of said virus can be effected. 

27. A bioassay for the detection of HIV-1 in a
biological sample comprising the steps of:

i) coating a surface with at least 5 amino
acids of an env protein from MN-ST1 or BA-L virus;

ii) contacting said coated surface with said
sample; and

iii) detecting the presence or absence of a
complex formed between said protein and antibodies
specific therefor present in said sample.